METHODS OF DIAGNOSIS AND TREATMENT OF ANDROGEN-DEPENDENT 
PROSTATE CANCER, PROSTATE CANCER UNDERGOING ANDROGEN 
WITHDRAWAL, AND ANDROGEN-INDEPENDENT PROSTATE CANCER 

CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority from the following applications: USSN 60/295,917, 
filed June 4, 2001; USSN 60/368,689, filed March 29, 2002; USSN 60/350,666, filed 
November 13, 2001; and USSN 60/372,246, filed April 12, 2002; each of which is 
incorporated herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in prostate 
cancer; and to the use of such expression profiles and compositions in the diagnosis, 
prognosis, and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer. 

BACKGROUND OF THE INVENTION 
Prostate cancer is the most frequently diagnosed cancer and the second leading cause 
of male cancer death in North America and northern Europe. Early detection of prostate 
cancer using a serum test for prostate-specific antigen (PSA) has dramatically improved the 
treatment of the disease (Oesterling (1992) J. Am. Med. Assoc. 267:2236-2238). Treatment 
of prostate cancer consists largely of surgical prostatectomy, radiation therapy, androgen 
ablation therapy and chemotherapy. Although many prostate cancer patients are effectively 
treated, the current therapies can all induce serious side effects which diminish quality of life. 
Patients who present with metastatic disease are most often treated with androgen-ablation 
therapy. Hormone blockade results in significant regression of the tumor. However, this 
treatment rarely cures the patient and invariably results in progression to androgen- 
independent disease, which is incurable. Afrin and Stuart (1994) J.S.C. Med. Assoc. 
90:231-236. 

WO 02/098358 PCT/US02/17594 

The identification of novel therapeutic targets and diagnostic markers is essential for 
improving the current treatment of prostate cancer patients. Recent advances in molecular 
medicine have increased the interest in tumor-specific cell surface antigens that could serve 
as targets for various immunotherapeutic or small molecule strategies. Antigens suitable for 
immunotherapeutic strategies should be highly expressed in cancer tissues and ideally not 
expressed in normal adult tissues. Expression in tissues that are dispensable for life, 
however, may be tolerated. Examples of such antigens include Her2/neu and the B-cell 
antigen CD20. Humanized monoclonal antibodies directed to Her2/neu (Herceptin) are 
currently in use for the treatment of metastatic breast cancer. Ross and Fletcher (1998) Stem 
Cells 16:413-428. Similarly, anti-CD20 monoclonal antibodies (Rituxan) are used to 
effectively treat non-Hodgkin's lymphoma. Maloney, et al. (1997) Blood 90:2188-2195; 
Leget and Czuczman (1998) Curr. Opin. Oncol. 10:548-551. 

Several potential immunotherapeutic targets have been identified for prostate cancer. 
They include prostate-specific membrane antigen (PSMA) (Israeli, et al. (1993) Cancer Res. 
53:227-230), prostate stem cell antigen (PSCA; Reiter, et al. (1998) Proc. Natl. Acad. Sci. 
USA 95:1735-1740), and serpentine transmembrane epithelial antigen of the prostate 
(STEAP; Hubert, et al. (1999) Proc. Natl. Acad. Sci. USA 96:14529-14534). PSMA is a type 
II transmembrane hydrolase with significant homology to a rat neuropeptidase (Carter, et al. 
(1996) Proc. Natl. Acad. Sci. USA 93:749-753). Antibodies directed towards PSMA are 
currently being used to detect metastasized prostate cancer as the Prostascint Scan (Sodee, et 
al. (1996) Clin. Nucl. Med. 21:759-767) and are also being evaluated for treatment of 
advanced disease (Gregorakis, et al. (1998) Semin. Urol. Oncol. 16:2-12; Liu, et al. (1998) 
Cancer Res. 58:4055-4060; Murphy, et al. (1998) J. Urol. 160:2396-2401). In a study on 
bone metastasis of prostate cancer, only 8 out of 18 patient samples expressed PSMA (Silver, 
et al. (1997) Clin. Cancer Res. 3:81-85). Therefore, it is clear that other targets need to be 
identified to manage metastasized disease. PSCA is a member of the Thy-1/Ly-6 family of 
glycosylphosphatidylinositol-linked plasma membrane proteins (Reiter, et al. (1998) Proc. 
Natl. Acad. Sci. USA 95:1735-1740). Immunohistochemical data show that PSCA is up- 
regulated in the majority of prostate cancer epithelia and is also detected in bone metastasis 
(Gu, et al. (2000) Oncogene 19:1288-1296). Recent work shows that antibodies directed to 
PSCA can prevent metastatic spread of prostate cancer in a mouse model (Saffran, et al. 
(2001) Proc. Natl. Acad. Sci. USA 98:2658-2663). STEAP is a multi-transmembrane 
prostate-specific protein that may function as a channel or transporter protein (Hubert, et al. 
(1999) Proc. Natl. Acad. Sci. USA 96:14529-14534). Its protein expression is specific to the 
basolateral membranes of normal prostate and prostate cancer epithelia. STEAP expression 
was most highly concentrated at cell-cell boundaries, implying a potential function in 
intercellular communication. Therapeutic monoclonal antibodies have so far not been 
reported for STEAP. 

SUMMARY OF THE INVENTION 

The present invention therefore provides nucleotide sequences of genes that are up- 
and down-regulated in androgen-independent prostate cancer cells or prostate cells 
undergoing androgen withdrawal. Such genes are useful for diagnostic purposes, and also as 
targets for screening for therapeutic compounds that modulate prostate cancer, such as 
hormones or antibodies. Other aspects of the invention will become apparent to the skilled 
artisan from the following description of the invention. 

In one aspect, the present invention provides a method of detecting an androgen- 
independent prostate cancer-associated transcript in a cell from a patient, the method 
comprising contacting a biological sample from the patient with a polynucleotide that 
selectively hybridizes to a nucleic acid molecule comprising a sequence at least 80% identical 
to a sequence as shown in Tables 1A-4. 

In one embodiment, the present invention provides a method of determining the level 
of a prostate cancer associated transcript in a cell from a patient. 

In one embodiment, the present invention provides a method of detecting a prostate 
cancer-associated transcript in a cell from a patient, the method comprising contacting a 
biological sample from the patient with a polynucleotide that selectively hybridizes to a 
sequence at least 80% identical to a sequence as shown in Tables 1A-4. 

In various embodiments, the polynucleotide selectively hybridizes to a sequence at 
least 95% identical to a sequence as shown in Tables 1A-4; the polynucleotide comprises a 
sequence as shown in Tables 1A-4; the biological sample is a tissue sample; the biological 
sample comprises isolated nucleic acids, e.g., mRNA; the polynucleotide is labeled, e.g., with 
a fluorescent label; the polynucleotide is immobilized on a solid surface; the patient is 
undergoing a therapeutic regimen to treat prostate cancer; the patient is suspected of having 
metastatic prostate cancer; the patient is a human; the patient is suspected of having a taxol- 
resistant cancer; or the prostate cancer associated transcript is mRNA. 

In other embodiments, the method further comprises the step of amplifying nucleic 
acids before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i) 
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii) 
determining the level of a prostate cancer-associated transcript in the biological sample by 
contacting the biological sample with a polynucleotide that selectively hybridizes to a 
sequence at least 80% identical to a sequence as shown in Tables 1A-4, thereby monitoring 
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate 
cancer. In a further embodiment, the patient has a drug resistant (e.g., taxol resistant) form of 
prostate cancer. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the prostate cancer-associated transcript to a level of the prostate cancer-associated 
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic 
treatment. 
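
By way of illustration only, the comparison of step (iii) can be sketched in Python as a ratio of each later measurement to the pre-treatment level. The function name `relative_levels`, the sample labels, and the numeric levels below are hypothetical and are not taken from the methods described herein; any quantitative readout of transcript level could be substituted.

```python
def relative_levels(samples):
    """Express each transcript level as a ratio to the first (baseline) sample.

    `samples` is a list of (label, level) tuples in collection order.
    The first entry is assumed to be the pre-treatment or earliest sample.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    _, baseline = samples[0]
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    return [(label, level / baseline) for label, level in samples]


# Hypothetical measurements of one transcript, in arbitrary units.
series = [("pre-treatment", 120.0), ("week 4", 60.0), ("week 12", 30.0)]
for label, ratio in relative_levels(series):
    print(f"{label}: {ratio:.2f} x baseline")
```

A ratio that moves away from 1.0 over the course of treatment indicates a change in transcript level relative to the earlier sample; how large a change is meaningful is left open in this sketch.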

Additionally, provided herein is a method of evaluating the effect of a candidate 
prostate cancer drug comprising administering the drug to a patient and removing a cell 
sample from the patient. The expression profile of the cell is then determined. This method 
may further comprise comparing the expression profile to an expression profile of a healthy 
individual. In a preferred embodiment, said expression profile includes a gene of Tables 1A- 
4. 

In one aspect, the present invention provides an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1A-4. 

In one embodiment, an expression vector or cell comprises the isolated nucleic acid. 

In one aspect, the present invention provides an isolated polypeptide which is encoded 
by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1A-4. 

In another aspect, the present invention provides an antibody that specifically binds to 
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide 
sequence as shown in Tables 1A-4. 

In certain embodiments, the antibody is conjugated to an effector component, e.g., a 
fluorescent label, a radioisotope or a cytotoxic chemical; the antibody is an antibody 
fragment; or the antibody is humanized. 

In one aspect, the present invention provides a method of detecting a prostate cancer 
cell in a biological sample from a patient, the method comprising contacting the biological 
sample with an antibody as described herein. 

In another aspect, the present invention provides a method of detecting antibodies 
specific to prostate cancer in a patient, the method comprising contacting a biological sample 
from the patient with a polypeptide encoded by a nucleic acid comprising a sequence from 
Tables 1A-4. 

In another aspect, the present invention provides a method for identifying a compound 
that modulates a prostate cancer-associated polypeptide, the method comprising the steps of: 
a) contacting the compound with a prostate cancer-associated polypeptide, the polypeptide 
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical 
to a sequence as shown in Tables 1A-4; and b) determining the functional effect of the 
compound upon the polypeptide. 

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a 
chemical effect. 

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or cell 
membrane. In another embodiment, the polypeptide is recombinant. 

In one embodiment, the functional effect is determined by measuring ligand binding 
to the polypeptide. 

In another aspect, the present invention provides a method of inhibiting proliferation 
of a prostate cancer-associated cell to treat prostate cancer in a patient, the method 
comprising the step of administering to the patient a therapeutically effective amount of a 
compound identified as described herein. 

In one embodiment, the compound is an antibody. 

In another aspect, the present invention provides a drug screening assay comprising 
the steps of: a) administering a test compound to a mammal having prostate cancer or to a 
cell sample isolated therefrom; b) comparing the level of gene expression of a polynucleotide 
that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in 
Tables 1A-4 in a treated cell or mammal with the level of gene expression of the 
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates 
the level of expression of the polynucleotide is a candidate for the treatment of prostate 
cancer. 

In one embodiment, the control is a mammal with prostate cancer or a cell sample 
therefrom that has not been treated with the test compound. In another embodiment, the 
control is a normal cell or mammal. 

In one embodiment, the test compound is administered in varying amounts or 
concentrations. In another embodiment, the test compound is administered for varying time 
periods. In another embodiment, the comparison can occur after addition or removal of the 
drug candidate. 

In one embodiment, the levels of a plurality of polynucleotides that selectively 
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1A-4 are 
individually compared to their respective levels in a control cell sample or mammal. In a 
preferred embodiment the plurality of polynucleotides is from three to ten. 
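
The individual comparison of a plurality of polynucleotides against a control can be sketched as follows. This is a minimal Python illustration under stated assumptions: the gene names and expression levels are hypothetical, and the two-fold cut-off (`min_abs_log2=1.0`) is an arbitrary choice made for the example rather than a threshold given in this description.

```python
import math


def modulated_genes(treated, control, min_abs_log2=1.0):
    """Compare each gene's treated level with its own control level.

    `treated` and `control` map gene name -> expression level (same units).
    Returns {gene: log2(treated / control)} for genes whose absolute log2
    ratio meets the cut-off. The cut-off is an assumption of this sketch.
    """
    results = {}
    for gene, treated_level in treated.items():
        control_level = control[gene]
        log2_ratio = math.log2(treated_level / control_level)
        if abs(log2_ratio) >= min_abs_log2:
            results[gene] = log2_ratio
    return results


# Hypothetical three-gene panel, in arbitrary units.
treated = {"geneA": 400.0, "geneB": 110.0, "geneC": 25.0}
control = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0}
print(modulated_genes(treated, control))
```

Positive values indicate higher expression in the treated sample and negative values indicate lower expression; genes that stay near the control level are omitted from the result.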

In another aspect, the present invention provides a method for treating a mammal 
having prostate cancer comprising administering a compound identified by the assay 
described herein. 

In another aspect, the present invention provides a pharmaceutical composition for 
treating a mammal having prostate cancer, the composition comprising a compound 
identified by the assay described herein and a physiologically acceptable excipient. 

In one aspect, the present invention provides a method of screening drug candidates 
by providing a cell expressing a gene that is up- and down-regulated as in a prostate cancer. 
In one embodiment, a gene is selected from Tables 1A-4. The method further includes 
adding a drug candidate to the cell and determining the effect of the drug candidate on the 
expression of the expression profile gene. 

In one embodiment, the method of screening drug candidates includes comparing the 
level of expression in the absence of the drug candidate to the level of expression in the 
presence of the drug candidate, wherein the concentration of the drug candidate can vary 
when present, and wherein the comparison can occur after addition or removal of the drug 
candidate. In a preferred embodiment, the cell expresses at least two expression profile 
genes. The profile genes may show an increase or decrease. 
Also provided is a method of evaluating the effect of a candidate prostate cancer drug 
comprising administering the drug to a transgenic animal expressing or over-expressing the 
prostate cancer modulatory protein, or an animal lacking the prostate cancer modulatory 
protein, for example as a result of a gene knockout. 

Moreover, provided herein is a biochip comprising one or more nucleic acid segments 
of Tables 1A-4, wherein the biochip comprises fewer than 1000 nucleic acid probes. 
Preferably, at least two nucleic acid segments are included. More preferably, at least three 
nucleic acid segments are included. 

Furthermore, a method of diagnosing a disorder associated with prostate cancer is 
provided. The method comprises determining the expression of a gene of Tables 1A-4, in a 
first tissue type of a first individual, and comparing the distribution to the expression of the 
gene from a second normal tissue type from the first individual or a second unaffected 
individual. A difference in the expression indicates that the first individual has a disorder 
associated with prostate cancer. 

In a further embodiment, the biochip also includes a polynucleotide sequence of a 
gene that is not up- and down-regulated in prostate cancer. 

In one embodiment, a method is provided for screening for a bioactive agent capable of interfering 
with the binding of a prostate cancer modulating protein (prostate cancer modulatory protein) 
or a fragment thereof and an antibody which binds to said prostate cancer modulatory protein 
or fragment thereof. In a preferred embodiment, the method comprises combining a prostate 
cancer modulatory protein or fragment thereof, a candidate bioactive agent and an antibody 
which binds to said prostate cancer modulatory protein or fragment thereof. The method 
further includes determining the binding of said prostate cancer modulatory protein or 
fragment thereof and said antibody. Wherein there is a change in binding, an agent is 
identified as an interfering agent. The interfering agent can be an agonist or an antagonist. 
Preferably, the agent inhibits prostate cancer. 

Also provided herein are methods of eliciting an immune response in an individual. 
In one embodiment a method provided herein comprises administering to an individual a 
composition comprising a prostate cancer modulating protein, or a fragment thereof. In 
another embodiment, the protein is encoded by a nucleic acid selected from those of Tables 
1A-4. 
Further provided herein are compositions capable of eliciting an immune response in 
an individual. In one embodiment, a composition provided herein comprises a prostate 
cancer modulating protein, preferably encoded by a nucleic acid of Tables 1A-4, or a 
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said 
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer 
modulating protein, preferably selected from the nucleic acids of Tables 1A-4, and a 
pharmaceutically acceptable carrier. 

Also provided are methods of neutralizing the effect of a prostate cancer protein, or a 
fragment thereof, comprising contacting an agent specific for said protein with said protein in 
an amount sufficient to effect neutralization. In another embodiment, the protein is encoded 
by a nucleic acid selected from those of Tables 1A-4. In another aspect of the invention, a 
method of treating an individual for prostate cancer is provided. In one embodiment, the 
method comprises administering to said individual an inhibitor of a prostate cancer 
modulating protein. In another embodiment, the method comprises administering to a patient 
having prostate cancer an antibody to a prostate cancer modulating protein conjugated to a 
therapeutic moiety. Such a therapeutic moiety can be a cytotoxic agent or a radioisotope. 

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides novel 
methods for diagnosis and evaluation of androgen-dependent prostate cells (malignant or 
non-malignant), prostate cells undergoing androgen withdrawal, and androgen-independent 
prostate cancer, as well as methods for treating androgen-dependent prostate cells (malignant 
or non-malignant), prostate cancer undergoing androgen withdrawal, and androgen- 
independent prostate cancer. The current Specification incorporates the text of USSN 
09/976,858, filed October 12, 2001; USSN 60/295,917, filed June 4, 2001; USSN 
60/368,689, filed March 29, 2002; USSN 60/350,666, filed November 13, 2001; and USSN 
60/372,246, filed April 12, 2002. 

Table 1A provides unigene cluster identification numbers for the nucleotide sequence 
of genes that exhibit increased or decreased expression in androgen-independent prostate 
cancer samples. Table 1A also provides an exemplar accession number that provides a 
nucleotide sequence that is part of the unigene cluster. The expression patterns of the genes 
of Table 1A can be broadly defined into the following categories: 

Genes that are expressed early in the time course, then drop off in expression, and then 
express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A). 

Genes that are expressed early in the time course, then drop off in expression, and do not 
express again with emergence of androgen-independence (hi-lo-lo pattern in Table 1A). 

Genes that are not expressed early in the time course, but express only with emergence of 
androgen-independence (lo-lo-hi pattern in Table 1A). 

Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi 
pattern in Table 1A). 

Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern 
in Table 1A). 

Tables 2A-C provide unigene cluster identification numbers for the nucleotide 
sequence of genes that exhibit increased or decreased expression in androgen-dependent 
prostate cancer, prostate cancer undergoing androgen withdrawal and androgen-independent 
prostate cancer. Tables 2A-C also provide an exemplar accession number that provides a 
nucleotide sequence that is part of the unigene cluster. The expression patterns of the genes 
of Tables 2A-C can be broadly defined into the following 6 categories: 

Genes that are expressed early in the time course of androgen withdrawal, then drop 
off in expression, and then express again with emergence of androgen-independence 
(hi-lo-lo-hi pattern in Table 2A). 

Genes that are expressed early in the time course, then drop off in expression 
immediately after androgen-withdrawal, and do not express again with emergence of 
androgen-independence (hi-lo-lo-lo pattern in Table 2A). 

Genes that are expressed early in the time course, then drop off in expression after 
several days of androgen withdrawal, and do not express again with emergence of 
androgen-independence (hi-hi-lo-lo pattern in Table 2A). 

Genes that are not expressed early in the time course, but express only with emergence 
of androgen-independence (lo-lo-lo-hi pattern in Table 2A). 

Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and continue to express with emergence of androgen-independence 
(lo-lo-hi-hi pattern in Table 2A). 

Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and drop off again with emergence of androgen-independence (lo-lo-hi-lo 
pattern in Table 2A). 
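
The pattern labels used in Table 2A (for example hi-lo-lo-hi) can be read as a per-time-point call of high or low expression. The following Python sketch shows one way such labels could be derived and used to group genes. It is offered under explicit assumptions: the single fixed threshold, the gene names, and the expression values are hypothetical, and the criteria actually used to assign genes to the categories above are not specified here.

```python
from collections import defaultdict


def expression_pattern(levels, threshold):
    """Label each time point 'hi' or 'lo' and join the labels with hyphens.

    `levels` holds one expression value per sampled time point, in order.
    `threshold` is an assumed cut-off separating 'hi' from 'lo'.
    """
    return "-".join("hi" if level >= threshold else "lo" for level in levels)


def group_by_pattern(profiles, threshold):
    """Group gene names by their hi/lo pattern across the time course."""
    groups = defaultdict(list)
    for gene, levels in profiles.items():
        groups[expression_pattern(levels, threshold)].append(gene)
    return dict(groups)


# Hypothetical four-point time courses, in arbitrary units.
profiles = {
    "gene1": [850, 90, 60, 700],
    "gene2": [50, 60, 400, 80],
    "gene3": [900, 100, 50, 800],
}
print(group_by_pattern(profiles, threshold=200))
```

The same function applies unchanged to a three-point time course, in which case it yields labels of the form used in Table 1A, such as hi-lo-hi.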


Definitions 



The term "androgen ablation therapy" refers to techniques for the removal or 
destruction of sources of male hormones, such as testosterone. These techniques include, for 
example, 1) surgical removal of the testicles, 2) medications such as gonadotropin-releasing 
hormone analogs that inhibit testosterone production, or 3) anti-androgenic drugs that block 
androgen receptors. 

The term "androgen-independent prostate cancer protein" or "androgen-independent 
prostate cancer polynucleotide" or "androgen-independent prostate cancer-associated 
transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and 
interspecies homologues that: (1) have a nucleotide sequence that has greater than about 60% 
nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 
94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide sequence identity, preferably over a 
region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a 
nucleotide sequence of or associated with a unigene cluster of Tables 1A-4; (2) bind to 
antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino 
acid sequence encoded by a nucleotide sequence of or associated with a unigene cluster of 
Tables 1A-4 and conservatively modified variants thereof; (3) specifically hybridize under 
stringent hybridization conditions to a nucleic acid sequence, or the complement thereof, of 
Tables 1A-4 and conservatively modified variants thereof; or (4) have an amino acid 
sequence that has greater than about 60% amino acid sequence identity, 65%, 70%, 75%, 
80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater 
amino acid sequence identity, preferably over a region of at least about 25, 50, 
100, 200, 500, 1000, or more amino acids, to an amino acid sequence encoded by a nucleotide 
sequence of or associated with a unigene cluster of Tables 1A-4. These polynucleotides or 
proteins may also be expressed during a period following androgen withdrawal. A 
polynucleotide or polypeptide sequence is typically from a mammal including, but not 
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or 
other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide" 
include both naturally occurring and recombinant forms, and may refer to those polypeptides 
or polynucleotides which are expressed in prostate proliferative cells. 

A "full length" prostate cancer protein or nucleic acid refers to a prostate cancer 
polypeptide or polynucleotide sequence, or a variant thereof, that contains the elements 
normally contained in one or more naturally occurring, wild type prostate cancer 
polynucleotide or polypeptide sequences. The "full length" may be prior to, or after, various 
stages of post-translation processing or splicing, including alternative splicing. 

"Biological sample" as used herein is a sample of biological tissue or fluid that 
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or 
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g., 
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of 
tissues such as biopsy and autopsy samples, frozen sections taken for histology purposes, 
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also 
include explants and primary and/or transformed cell cultures derived from patient tissues. A 
biological sample is typically obtained from a eukaryotic organism, most preferably a 
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea 
pig, rat, mouse; rabbit; or a bird; reptile; or fish. 

"Providing a biological sample" means to obtain a biological sample for use in 
methods described in this invention. Most often, this will be done by removing a sample of 
cells from an animal, but can also be accomplished by using previously isolated cells (e.g., 
isolated by another person, at another time, and/or for another purpose), by collecting a 
sample which contains a soluble polypeptide or nucleic acid derived from a prostate cell, or 
by performing the methods of the invention in vivo. Archival tissues, having treatment or 
outcome history, will be particularly useful. 

The terms "identical" or percent "identity," in the context of two or more nucleic 
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the 
same or have a specified percentage of amino acid residues or nucleotides that are the same 
(i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 
96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned 
for maximum correspondence over a comparison window or designated region) as measured 
using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters 
described below, or by manual alignment and visual inspection (see, e.g., NCBI web site 
http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be 
"substantially identical." This definition also refers to, or may be applied to, the complement 
of a test sequence. The definition also includes sequences that have deletions and/or 
additions, as well as those that have substitutions, as well as naturally occurring, e.g., 
polymorphic or allelic variants, and man-made variants. As described below, the preferred 
algorithms can account for gaps and the like. Preferably, identity exists over a region that is 
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 
50-100 amino acids or nucleotides in length. 
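
As a simple illustration of the percent identity calculation, the Python sketch below counts identical positions in two sequences that have already been aligned, with gaps written as "-". The function name `percent_identity` and the example sequences are hypothetical. The choice of denominator (every alignment column, including gapped columns) is an assumption of this sketch; alignment programs differ on this convention, and the values reported by BLAST for a given pair may therefore differ.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length, with '-' marking
    gaps. Columns that are gaps in both sequences are ignored; columns with a
    gap in only one sequence count as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # one mismatch in ten columns
print(percent_identity("ACGT-CGT", "ACGTACGT"))      # one gapped column in eight
```

Restricting the input strings to a designated region, such as a window of at least about 25 positions, gives the identity over that region.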

For sequence comparison, typically one sequence acts as a reference sequence, to 
which test sequences are compared. When using a sequence comparison algorithm, test and 
reference sequences are entered into a computer, subsequence coordinates are designated, if 
necessary, and sequence algorithm program parameters are designated. Preferably, default 
program parameters can be used, or alternative parameters can be designated. The sequence 
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 

A "comparison window", as used herein, includes reference to a segment of one of the 
number of contiguous positions selected from the group consisting typically of from 20 to 
600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence 
may be compared to a reference sequence of the same number of contiguous positions after 
the two sequences are optimally aligned. Methods of alignment of sequences for comparison 
are well-known in the art. Optimal alignment of sequences for comparison can be conducted, 
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by 
the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443- 
453, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad. 
Sci. USA 85:2444-2448, by computerized implementations of these algorithms (GAP, 
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics 
Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual 
inspection (see, e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in 
Molecular Biology Lippincott). 
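
For orientation, the local homology approach of Smith and Waterman can be sketched as a small dynamic-programming routine. The Python example below computes only the best local alignment score. The scoring values (match 2, mismatch -1) and the linear gap penalty (-2 per gapped position) are assumptions chosen for illustration; they are not parameters stated in this description, and the packaged implementations named above use their own scoring schemes.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty, keeping
    only two rows of the scoring matrix in memory. Cell values are floored
    at zero so that an alignment may start anywhere in either sequence.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core "ACGT" is found regardless of the unrelated flanks.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Because the score is floored at zero, unrelated flanking sequence does not reduce the score of a shared internal region, which is the property that distinguishes local from global alignment.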

Preferred examples of algorithms that are suitable for determining percent sequence 
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are 
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990) 
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described 
herein, to determine percent sequence identity for the nucleic acids and proteins of the 
invention. Software for performing BLAST analyses is publicly available through the 
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This 
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is 
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial 
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing 
them. The word hits are extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g., 
for nucleotide sequences, the parameters M (reward score for a pair of matching residues; 
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid 
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915-919), alignments (B)
of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
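The extension step described above can be illustrated with a minimal Python sketch. It is not the BLAST implementation: it extends a seed in one direction only, without gaps, using the M and N values quoted above. The function name `extend_hit_right` and the default value of `x_drop` are assumptions made for illustration.

```python
def extend_hit_right(query, subject, q_start, s_start, word_len,
                     match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of a seed word hit (illustrative only).

    match/mismatch correspond to the M and N parameters; x_drop plays the
    role of X. The default x_drop is an assumed value, not a BLAST default.
    Returns (best cumulative score, length of the best-scoring segment).
    """
    # Score the seed word itself.
    score = 0
    for k in range(word_len):
        score += match if query[q_start + k] == subject[s_start + k] else mismatch
    best_score, best_len = score, word_len

    i, j = q_start + word_len, s_start + word_len
    # Stop at the end of either sequence.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        i += 1
        j += 1
        if score > best_score:
            best_score, best_len = score, i - q_start
        elif best_score - score >= x_drop or score <= 0:
            # Score fell off by X from its maximum, or dropped to zero or below.
            break
    return best_score, best_len


print(extend_hit_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, 4, x_drop=8))
```

A full search would also extend to the left and, for amino acid sequences, look up pair scores in a matrix such as BLOSUM62 rather than use fixed M and N values.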

The BLAST algorithm also performs a statistical analysis of the similarity between 
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between 
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid 
is considered similar to a reference sequence if the smallest sum probability in a comparison 
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably 
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc. 

An indication that two nucleic acid sequences or polypeptides are substantially 
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross 
reactive with the antibodies raised against the polypeptide encoded by the second nucleic 
acid, as described below. Thus, a polypeptide is typically substantially identical to a second 
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another 
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described below. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 
same primers can be used to amplify the sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an 
expression vector and supports the replication or expression of the expression vector. Host 
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be 
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org). 

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide," and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers. Certain diagnostic
methods may evaluate secreted or breakdown products present only because the producing
cell is present, and would otherwise be absent in a normal individual.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally 
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic 
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical 
structure of an amino acid, but that functions similarly to a naturally occurring amino acid. 

Amino acids may be referred to herein by either their commonly known three letter 
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic acid 
sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU encode the amino acid
alanine. Thus, at every position where an alanine is specified by a codon, the codon can be
altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences. 
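As an illustration of silent variations, the following Python sketch builds the standard genetic code and lists the codons that encode the same amino acid as a given codon. It uses the DNA alphabet (T in place of U), and the helper names `translate`, `synonymous_codons`, and `is_silent_variation` are hypothetical, chosen only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """Return all codons encoding the same amino acid as the given codon."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def is_silent_variation(seq_a, seq_b):
    """True if two different coding sequences encode the same polypeptide."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


print(synonymous_codons("GCA"))   # the four alanine codons
print(synonymous_codons("ATG"))   # methionine has a single codon
print(is_silent_variation("ATGGCATGG", "ATGGCCTGG"))
```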

As to amino acid sequences, one of skill will recognize that individual substitutions, 
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which 
alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded 
sequence is a "conservatively modified variant" where the alteration results in the substitution 
of an amino acid with a chemically similar amino acid. Conservative substitutions providing 
functionally similar amino acids are well known in the art. Such conservatively modified 
variants are in addition to and do not exclude polymorphic variants, interspecies homologs,
and alleles of the invention. The following eight groups each contain amino acids that are
typically conservative substitutions for one another: 1) Alanine
(A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 
4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) 
Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) 
Cysteine (C), Methionine (M) (see, e.g., Creighton (1984) Proteins Freeman). 
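The eight groups listed above can be encoded directly. The Python sketch below is a minimal illustration that tests whether two equal-length peptides differ only by substitutions within one of those groups; the function names are hypothetical, and deletions and additions are deliberately not handled.

```python
# The eight groups listed above, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative_substitution(a, b):
    """True if residues a and b differ but share at least one group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(pep1, pep2):
    """True if equal-length peptides differ only by conservative substitutions."""
    if len(pep1) != len(pep2):
        return False
    return all(x == y or is_conservative_substitution(x, y)
               for x, y in zip(pep1, pep2))


print(differ_only_conservatively("MKDL", "MREI"))
print(differ_only_conservatively("MKDL", "MKDW"))
```

Methionine appears in two groups (5 and 8), so membership is tested per group rather than through a single residue-to-group mapping.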

Macromolecular structures such as polypeptide structures can be described in terms of 
various levels of organization. For a general discussion of this organization, see, e.g., 
Alberts, et al. (2001) Molecular Biology of the Cell (4th ed.) and Cantor and Schimmel 
(1980) Biophysical Chemistry Part I: The Conformation of Biological Macromolecules 
Freeman. "Primary structure" refers to the amino acid sequence of a particular peptide. 
"Secondary structure" refers to locally ordered, three dimensional structures within a 
polypeptide. These structures are commonly known as domains. Domains are portions of a 
polypeptide that often form a compact unit of the polypeptide and are typically 25 to 
approximately 500 amino acids long. Typical domains are made up of sections of lesser 
organization such as stretches of β-sheet and α-helices. "Tertiary structure" refers to the
complete three dimensional structure of a polypeptide monomer. "Quaternary structure" 
refers to the three dimensional structure formed, usually by the noncovalent association of 
independent tertiary units. Anisotropic terms are also known as energy terms. 

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents 
used herein means at least two nucleotides covalently linked together. Oligonucleotides are 
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up 
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of
virtually any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000,
7000, 10,000, etc. A nucleic acid of the present invention will generally contain
phosphodiester bonds, although in some cases, nucleic acid analogs are included that may 
have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, 
phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein (1992) 
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones, non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7 in
Sanghvi and Cook (eds. 1994) Carbohydrate Modifications in
Antisense Research ACS Symposium Series 580. Nucleic acids containing one or more
carbocyclic sugars are also included within one definition of nucleic acids. Modifications of 
the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the 
stability and half-life of such molecules in physiological environments or as probes on a 
biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; 
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring 
nucleic acids and analogs may be made. 

A variety of references disclose such nucleic acid analogs, including, for example, 
phosphoramidate (Beaucage, et al. (1993) Tetrahedron 49(10):1925-1963 and references 
therein; Letsinger (1970) J. Org. Chem. 35:3800-3803; Sprinzl, et al. (1977) Eur. J. Biochem. 
81:579-589; Letsinger, et al. (1986) Nucl. Acids Res. 14:3487-499; Sawai, et al. (1984)
Chem. Lett. 805; Letsinger, et al. (1988) J. Am. Chem. Soc. 110:4470-4471; and Pauwels, et
al. (1986) Chemica Scripta 26:141-149), phosphorothioate (Mag, et al. (1991) Nucleic Acids
Res. 19:1437-441; and U.S. Patent No. 5,644,048), phosphorodithioate (Briu, et al. (1989) J.
Am. Chem. Soc. 111:2321-xxx), O-methylphosphoroamidite linkages (see Eckstein (1992)
Oligonucleotides and Analogues: A Practical Approach Oxford University Press), and
peptide nucleic acid backbones and linkages (see Egholm (1992) J. Am. Chem. Soc.
114:1895-1897; Meier, et al. (1992) Chem. Int. Ed. Engl. 31:1008-1010; Nielsen (1993)
Nature 365:566-568; Carlsson, et al. (1996) Nature 380:207, each of which is incorporated by
reference). Other analog nucleic acids include those with positive backbones (Dempcy, et al.
(1995) Proc. Natl. Acad. Sci. USA 92:6097-101); non-ionic backbones (U.S. Patent Nos.
5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowski, et al. (1991) Angew.
Chem. Intl. Ed. English 30:423-426; Letsinger, et al. (1988) J. Am. Chem. Soc. 110:4470;
Letsinger, et al. (1994) Nucleoside and Nucleotide 13:1597-xxx; Chapters 2 and 3 in
Sanghvi and Cook (eds. 1994) Carbohydrate Modifications in Antisense Research ACS 
Symposium Series 580; Mesmaeker, et al. (1994) Bioorganic and Medicinal Chem. Lett. 
4:395-xxx; Jeffs, et al. (1994) J. Biomolecular NMR 34:17; Horn (1996) Tetrahedron Lett. 
37:743-xxx) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7 in Sanghvi and Cook (eds. 1994) 
Carbohydrate Modifications in Antisense Research ACS Symposium Series 580. Nucleic 
acids containing one or more carbocyclic sugars are also included within one definition of 
nucleic acids (see Jenkins, et al. (1995) Chem. Soc. Rev. xx:169-176). Several nucleic acid
analogs are described in Rawls (p. 35, June 2, 1997) C&E News. Each of these references is
hereby expressly incorporated by reference. 

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched base pairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain 
portions of both double stranded and single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic
acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of
bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine,
isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally occurring RNA, e.g.,
a pre-mRNA, hnRNA, or mRNA. As used herein, the term "nucleoside" includes nucleotides
and nucleoside and nucleotide analogs, and modified nucleosides such as amino modified
nucleosides. In addition, "nucleoside" includes non-naturally occurring analog structures.
Thus, e.g., the individual units of a peptide nucleic acid, each containing a base, are referred
to herein as a nucleoside. 
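The statement that a depicted single strand also defines its complementary strand can be shown with a short Python sketch. It handles only the four standard DNA bases; the function name `reverse_complement` is chosen for illustration, and modified bases such as inosine are outside its scope.

```python
# Watson-Crick pairing for the four standard DNA bases only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original strand, which is why depicting one strand is sufficient to specify both.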

A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, chemical, or other physical means. For 
example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g.,
as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other entities
which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to
detect antibodies specifically reactive with the peptide. The labels may be incorporated into
the prostate cancer nucleic acids, proteins, and antibodies at virtually any position. Many
methods for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either 
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der 
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be 
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a nucleic acid 
capable of binding to a target nucleic acid of complementary sequence through one or more 
types of chemical bonds, usually through complementary base pairing, usually through
hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or
modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be
joined by a linkage other than a phosphodiester bond, so long as it does not functionally
interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the
constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be 
understood by one of skill in the art that probes may bind target sequences lacking complete 
complementarity with the probe sequence depending upon the stringency of the hybridization 
conditions. The probes are preferably directly labeled as with isotopes, chromophores, 
lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin
complex may later bind. By assaying for the presence or absence of the probe, one can detect 
the presence or absence of the select sequence or subsequence. Diagnosis or prognosis may 
be based at the genomic level, or at the level of RNA or protein expression. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid, 
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by 
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic 
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant 
cells express genes that are not found within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid, 
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using 
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not 
normally joined, are both considered recombinant for the purposes of this invention. It is 
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or 
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the 
host cell rather than in vitro manipulations; however, such nucleic acids, once produced 
recombinantly, although subsequently replicated non-recombinantly, are still considered 
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein 
made using recombinant techniques, i.e., through the expression of a recombinant nucleic 
acid as depicted above. 

The term "heterologous" when used with reference to portions of a nucleic acid 
indicates that the nucleic acid comprises two or more subsequences that are not normally 
found in the same relationship to each other in nature. For instance, the nucleic acid is 
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes 
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two 
or more subsequences that are not found in the same relationship to each other in nature (e.g., 
a fusion protein). 

A "promoter" is defined as an array of nucleic acid control sequences that direct 
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of a polymerase II type 
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor 
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is 
active under environmental or developmental regulation. The term "operably linked" refers 
to a functional linkage between a nucleic acid expression control sequence (such as a 
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, 
wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated recombinantly or 
synthetically, with a series of specified nucleic acid elements that permit transcription of a 
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or 
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed operably linked to a promoter. 

The phrase "selectively (or specifically) hybridizes to" refers to the binding, 
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under 
stringent hybridization conditions when that sequence is present in a complex mixture (e.g., 
total cellular or library DNA or RNA). 

The phrase "stringent hybridization conditions" refers to conditions under which a 
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic 
acids, but to no other sequences. Stringent conditions are sequence-dependent and will be 
different in different circumstances. Longer sequences hybridize specifically at higher 
temperatures. An extensive guide to the hybridization of nucleic acids is found in "Overview of
principles of hybridization and the strategy of nucleic acid assays" in Tijssen (1993)
Hybridization with Nucleic Probes (Techniques in Biochemistry and Molecular Biology vol.
24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than 
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other 
salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g., 10 to 
50 nucleotides) and at least about 60° C for long probes (e.g., greater than 50 nucleotides). 
Stringent conditions may also be achieved with the addition of destabilizing agents such as 
formamide. For selective or specific hybridization, a positive signal is at least two times 
background, preferably 10 times background hybridization. Exemplary stringent 
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC, and 
0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency 
amplification, although annealing temperatures may vary between about 32° C and 48° C 
depending on primer length. For high stringency PCR amplification, a temperature of about 
62° C is typical, although high stringency annealing temperatures can range from about 50- 
65° C, depending on the primer length and specificity. Typical cycle conditions for both high 
and low stringency amplifications include a denaturation phase of 90-95° C for 30-120 sec, 
an annealing phase lasting 30-120 sec, and an extension phase of about 72° C for 1-2 min. 
Protocols and guidelines for low and high stringency amplification reactions are provided, 
e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and Applications Academic 
Press, N.Y. 
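The relationship between Tm and a stringent hybridization temperature (about 5-10° C below Tm) can be sketched in Python. The text above does not specify how Tm is calculated; the sketch substitutes the Wallace rule, a common rough approximation for short oligonucleotides (2° C per A or T, 4° C per G or C), which ignores salt, formamide, and probe concentration. Both function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate (deg C) for a short DNA oligonucleotide.

    Wallace rule: 2 deg C per A/T plus 4 deg C per G/C. This approximation
    is an assumption of the sketch, not a method taken from the text.
    """
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm):
    """Temperature range about 5-10 deg C below Tm, as described above."""
    return (tm - 10.0, tm - 5.0)


probe = "ATGCATGCATGCATGC"  # hypothetical 16-mer with 8 A/T and 8 G/C
tm = wallace_tm(probe)
print(tm, stringent_window(tm))
```

For long probes or buffers containing formamide, an empirical determination or a salt-adjusted formula would be needed instead of this estimate.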

Nucleic acids that do not hybridize to each other under stringent conditions are still 
substantially identical if the polypeptides which they encode are substantially identical. This 
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy 
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under 
moderately stringent hybridization conditions. Exemplary "moderately stringent 
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 
1% SDS at 37° C, and a wash in 1x SSC at 45° C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references,
e.g., Ausubel, et al. (eds. 1991 and supplements) Current Protocols in Molecular Biology.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a prostate cancer protein includes the determination of a parameter that 
is indirectly or directly under the influence of the prostate cancer protein or nucleic acid, e.g., 
a functional, physical, or chemical effect, such as the ability to decrease prostate proliferation 
(malignant or non-malignant). It includes ligand binding activity; cell growth on soft agar;
anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and 
protein expression in cells undergoing metastasis, and other characteristics of prostate cancer 
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities. 

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a
prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects.
Such functional effects can be measured by means known to those skilled in the art, e.g.,
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index),
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein,
measuring inducible markers or transcriptional activation of the prostate cancer protein;
measuring binding activity or binding assays, e.g., binding to antibodies or other ligands, and
measuring cellular proliferation. Determination of the functional effect of a compound on
prostate cancer can also be performed using prostate cancer assays known to those of skill in
the art such as in vitro assays, e.g., cell growth on soft agar; anchorage dependence;
contact inhibition and density limitation of growth; cellular proliferation; cellular
transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of prostate cancer cells.
The functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for prostate cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or 
compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide and 
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block 
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the 
activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic acids 
may seem to inhibit expression and subsequent function of the protein. "Activators" are 
compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, or 
up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also include 
genetically modified versions of prostate cancer proteins, e.g., versions with altered activity, 
as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, small 
chemical molecules and the like. Such assays for inhibitors and activators include, e.g., 
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative 
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences 
set out in Tables 1A-4. 

Samples or assays comprising prostate cancer proteins that are treated with a potential 
activator, inhibitor, or modulator are compared to control samples without the inhibitor, 
activator, or modulator to examine the extent of inhibition. Control samples (untreated with 
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide 
is achieved when the activity value relative to the control is about 80%, preferably 50%, more 
preferably 25-0%. Activation of a prostate cancer polypeptide is achieved when the activity 
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably 
1000-3000% higher. 
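
The comparison to control described above can be sketched as follows. This is a minimal illustration only, assuming that activity is available as a single positive number per sample; the function name `classify_activity` and the reading of "about 80%" and "110%" as hard cutoffs are assumptions made for the example and are not part of the disclosure.

```python
def classify_activity(treated, control):
    """Return (relative activity in % of control, label).

    The control sample is assigned 100%. Cutoffs follow the text:
    about 80% or less is treated as inhibition, 110% or more as
    activation. Treating these as hard cutoffs is an assumption.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    relative = 100.0 * treated / control
    if relative <= 80.0:
        label = "inhibition"
    elif relative >= 110.0:
        label = "activation"
    else:
        label = "no effect"
    return relative, label
```

For instance, a treated sample with 40 units of activity against a control of 100 units would be reported as 40% relative activity and labeled as inhibition.
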
The phrase "changes in cell growth" refers to a change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci, 
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and 
density limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., pp. 231-241 in Freshney (1994) Culture of Animal
Cells: A Manual of Basic Technique (3d ed.) Wiley-Liss. 

"Tumor cell" refers to precancerous, cancerous, and/or normal cells in a tumor. 

"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to 
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus 
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy. See, Freshney 
(2001) Culture of Animal Cells: A Manual of Basic Technique (4th ed.) Wiley-Liss. 

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul 
(ed. 1999) Fundamental Immunology (4th ed.) Raven. 

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each 
tetramer is composed of two identical pairs of polypeptide chains, each pair having one 
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each 
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively. 
Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1993) Fundamental Immunology (3d ed.) Raven).
While various antibody fragments are defined in terms of the digestion of an intact antibody, 
one of skill will appreciate that such fragments may be synthesized de novo either chemically 
or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also 
includes antibody fragments either produced by the modification of whole antibodies, or 
those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or 
those identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; pp. 77-96 in
Cole, et al. (1985) Monoclonal Antibodies and Cancer Therapy Liss; Coligan (1991) Current
Protocols in Immunology Lippincott; Harlow and Lane (1988) Antibodies: A Laboratory
Manual CSH Press; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.) Academic Press). Techniques for the production of single chain antibodies (U.S. Patent
4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, 
transgenic mice, or other organisms such as other mammals, may be used to express 
humanized antibodies. Alternatively, phage display technology can be used to identify 
antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., 
McCafferty, et al. (1990) Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779- 
783). 

A "chimeric antibody" is an antibody molecule in which (a) the constant region, or a 
portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable 
region) is linked to a constant region of a different or altered class, effector function and/or 
species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a
different or altered antigen specificity. 

Identification of prostate cancer-associated sequences 
In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have a particular gene similarly expressed, the evaluation of a
number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other
tissue) may be distinguished from pathological prostate cells, e.g., cancerous or metastatic
cancerous tissue of the prostate, or prostate cancer tissue or metastatic prostate cancerous
tissue can be compared with tissue samples of prostate and other tissues from surviving
cancer patients. By comparing expression profiles of tissue in known different prostate
cancer states, information regarding which genes are important (including both up- and
down-regulation of genes) in each of these states is obtained.
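
By way of illustration only, comparing a sample "fingerprint" with profiles of known states might be sketched as below. The sketch assumes each profile is a list of expression levels over the same ordered set of genes and uses Pearson correlation as the similarity measure; that measure, and the names `pearson` and `closest_state`, are assumptions for the example and are not prescribed by the text.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return the name of the reference state whose profile best matches the sample.

    `references` maps a state name (hypothetical labels such as "normal"
    or "cancer") to a reference profile over the same genes as `sample`.
    """
    return max(references, key=lambda state: pearson(sample, references[state]))
```

Because many genes are evaluated at once, two states that share the expression level of any single gene can still be told apart by the overall pattern.
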

The identification of sequences that are differentially expressed in prostate cancer
versus non-prostate cancer tissue allows the use of this information in a number of ways. For
example, a particular treatment regime may be evaluated: does a chemotherapeutic drug act
to down-regulate prostate cancer or other proliferative disorders, and thus tumor growth or
recurrence, in a particular patient? Alternatively, a treatment step may induce other markers
which may be used as targets to destroy tumor cells. Similarly, diagnosis and treatment
outcomes may be determined or confirmed by comparing patient samples with the known
expression profiles. Malignant disease may be compared to non-malignant conditions.
Metastatic tissue can also be analyzed to determine the stage of prostate cancer in the tissue,
or origin of primary tumor, e.g., metastasis from a remote primary site. Furthermore, these
gene expression profiles (or individual genes) allow screening of drug candidates with an eye
to mimicking or altering a particular expression profile; e.g., screening can be done for drugs
that suppress the prostate cancer expression profile. This may be done by making biochips
comprising sets of the important prostate cancer genes, which can then be used in these
screens. These methods can also be done on the protein basis; that is, protein expression
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered
for gene therapy purposes, including the administration of antisense nucleic acids, or the
prostate cancer proteins (including antibodies and other modulators thereof) administered as
therapeutic drugs.

Thus the present invention provides nucleic acid and protein sequences that are 
differentially expressed in prostate cancer relative to normal tissues and/or non-malignant 
disease, or in different types of related diseases, herein termed "prostate cancer sequences." 
As outlined below, prostate cancer sequences include those that are up-regulated (i.e., 
expressed at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., 
expressed at a lower level). In a preferred embodiment, the prostate cancer sequences are 
from humans; however, as will be appreciated by those in the art, prostate cancer sequences 
from other organisms may be useful in animal models of disease and drug evaluation; thus, 
other prostate cancer sequences are provided, from vertebrates, including mammals, 
including rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including 
sheep, goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer
sequences from other organisms may be obtained using the techniques outlined below. 

Prostate cancer sequences can include both nucleic acid and amino acid sequences. 
As will be appreciated by those in the art and is more fully outlined below, prostate cancer 
nucleic acid sequences are useful in a variety of applications, including diagnostic 
applications, which will detect naturally occurring nucleic acids, as well as screening 
applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates with 
selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic acid 
and/or amino acid sequence homology to the prostate cancer sequences outlined herein. Such 
homology can be based upon the overall nucleic acid or amino acid sequence, and is 
generally determined as outlined below, using either homology programs or hybridization 
conditions. 

For identifying prostate cancer-associated sequences, the prostate cancer screen 
typically includes comparing genes identified in different tissues, e.g., normal and cancerous 
tissues, or tumor tissue samples from patients who have metastatic disease vs. non metastatic 
tissue. Other suitable tissue comparisons include comparing prostate cancer samples with 
metastatic cancer samples from other cancers, such as lung, breast, gastrointestinal, and
ovarian cancers. Samples of different stages of prostate cancer, e.g., survivor tissue, drug
resistant states, and tissue undergoing metastasis, are applied to biochips comprising nucleic 
acid probes. The samples are first microdissected, if applicable, and treated as is known in 
the art for the preparation of mRNA. Suitable biochips are commercially available, e.g., from 
Affymetrix. Gene expression profiles are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between normal and 
disease states are compared to genes expressed in other normal tissues, preferably normal 
prostate, but also including, and not limited to lung, heart, brain, liver, breast, kidney, muscle, 
colon, small intestine, large intestine, spleen, bone, and placenta. In a preferred embodiment, 
those genes identified during the prostate cancer screen that are expressed in a significant 
amount in other tissues are removed from the profile, although in some embodiments, this is 
not necessary. That is, when screening for drugs, it is usually preferable that the target be
disease specific, to minimize possible side effects on other organs in which it might be expressed.
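
The removal of genes that are significantly expressed in other normal tissues can be sketched as follows. This is an illustrative sketch only: the text does not define "a significant amount", so the cutoff `max_level` is an assumed parameter, and the function name and gene labels are hypothetical.

```python
def filter_disease_specific(candidates, normal_tissue_expression, max_level):
    """Keep candidate genes whose expression stays below max_level in
    every other normal tissue examined.

    `normal_tissue_expression` maps gene -> {tissue name: expression level}.
    Genes with no data for other tissues are kept (an assumption).
    """
    kept = []
    for gene in candidates:
        levels = normal_tissue_expression.get(gene, {})
        if all(level < max_level for level in levels.values()):
            kept.append(gene)
    return kept
```

A gene that is strongly expressed in, e.g., normal liver would be dropped from the profile under this sketch, since a drug directed at that target could be expected to act on the liver as well.
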

In a preferred embodiment, prostate cancer sequences are those that are up-regulated 
in prostate cancer or related conditions; that is, the expression of these genes is higher in the 
prostate cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein 
often means at least about a two-fold change, preferably at least about a three fold change, 
with at least about five-fold or higher being preferred. Another embodiment is directed to
sequences up-regulated in non-malignant conditions relative to normal. 

Unigene cluster identification numbers and accession numbers herein are for the 
GenBank sequence database and the sequences of the accession numbers are hereby 
expressly incorporated by reference. GenBank is known in the art, see, e.g., Benson, et al. 
(1998) Nucleic Acids Research 26:1-7 and http://www.ncbi.nlm.nih.gov/. Sequences are also 
available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and 
DNA Database of Japan (DDBJ). U.S. Patent Application Nos. 09/687,576 and 09/976,858 (-
001-3) further disclose related sequences, compositions, and methods of diagnosis and 
treatment of prostate cancer and related conditions and are hereby expressly incorporated by 
reference. 

In another preferred embodiment, prostate cancer sequences are those that are down-regulated
in the prostate cancer; that is, the expression of these genes is lower in prostate
cancer tissue as compared to non-cancerous tissue. "Down-regulation" as used herein often
means at least about a two-fold change, preferably at least about a three-fold change, with at
least about five-fold or higher being preferred.
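
The fold-change definitions of up-regulation and down-regulation can be illustrated with a short sketch. It assumes one positive expression value per gene for cancer tissue and one for non-cancerous tissue; the function name `classify_regulation` and the default `min_fold` of 2.0 (taken from the "at least about a two-fold change" language) are choices made for the example.

```python
def classify_regulation(cancer_level, normal_level, min_fold=2.0):
    """Label a gene by fold change of cancer over non-cancerous tissue.

    Returns "up-regulated" when the ratio is at least min_fold,
    "down-regulated" when it is at most 1 / min_fold, else "unchanged".
    """
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    fold = cancer_level / normal_level
    if fold >= min_fold:
        return "up-regulated"
    if fold <= 1.0 / min_fold:
        return "down-regulated"
    return "unchanged"
```

Passing `min_fold=3.0` or `min_fold=5.0` would apply the more stringent preferred thresholds mentioned in the text.
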

Informatics 

The ability to identify genes that are over or under expressed in prostate cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with prostate cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets, Mechanism, and
Function, paper presented at the IBC Proteomics conference, Coronado, CA (June 11-12,
1998)). Subcellular toxicological information can also be utilized in a biological sensor
device to predict the likely toxicological effect of chemical exposures and likely tolerable
exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue from
datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids, saccharides,
lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on an electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for
clarity of illustration only. It will be apparent to those of skill in the art that similar databases
can be assembled for assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample undergoing prostate cancer, i.e., the identification of prostate cancer-associated
sequences described herein, provide an abundance of information, which can be correlated
with pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological
status, among others. Although the data generated from the assays of the invention is suited
for manual review and analysis, in a preferred embodiment, prior data processing using
high-speed computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics CSH Press; Durbin, et al. (eds. 1999)
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
Cambridge Univ. Press; Baxevanis and Ouellette (eds., 1998) Bioinformatics: A Practical
Guide to the Analysis of Genes and Proteins Wiley-Liss; Rashidi and Buehler (1999)
Bioinformatics: Basic Applications in Biological Science and Medicine CRC Press; Setubal,
et al. (eds. 1997) Introduction to Computational Molecular Biology Brooks/Cole; Misener
and Krawetz (eds. 2000) Bioinformatics: Methods and Protocols Humana Press; Higgins and
Taylor (eds. 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach Oxford Univ. Press; Brown (2001) Bioinformatics: A Biologist's Guide to
Biocomputing and the Internet Eaton Pub; Han and Kamber (2000) Data Mining: Concepts
and Techniques Kaufmann Pub.; and Waterman (1995) Introduction to Computational
Biology: Maps, Sequences, and Genomes Chapman and Hall.

The present invention provides a computer database comprising a computer and 
software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing sample 
is from a control tissue sample known to be free of pathological disorders. In a variation, at 
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or 
another tissue specimen to be analyzed for prostate cancer. In another variation, the assay 
records cross-tabulate one or more of the following parameters for each target species in a 
sample: (1) a unique identification code, which can include, e.g., a target molecular structure 
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample 
source; and (3) absolute and/or relative quantity of the target species present in the sample. 
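
One possible shape for such a cross-tabulated assay record is sketched below, for illustration only. The three fields mirror parameters (1) to (3) above; the class name `AssayRecord`, the field names, and the nested-dictionary layout of the cross-tabulation are assumptions for the example rather than a required storage format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str       # (1) unique identification code for the target species
    sample_source: str   # (2) sample source, e.g., control or pathological tissue
    quantity: float      # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build {target_id: {sample_source: quantity}} from assay records.

    A later record for the same target and source overwrites an earlier one.
    """
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return dict(table)
```

With records from a control sample and a pathological specimen loaded into the same table, the quantities for a given target can be read side by side by source.
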

The invention also provides for the storage and retrieval of a collection of target data 
in a computer data storage apparatus, which can include magnetic disks, optical disks, 
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic 
bubble memory devices, and other data storage devices, including CPU registers and on-CPU 
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of 
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate 
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor 
and a charge storage area, which may be on the transistor). In one embodiment, the invention 
provides such storage devices, and computer systems built therewith, comprising a bit pattern 
encoding a protein expression fingerprint record comprising unique identifiers for at least 10 
target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides a 
method for identifying related peptide or nucleic acid sequences, comprising performing a 
computerized comparison between a peptide or nucleic acid sequence assay record stored in 
or retrieved from a computer storage device or database and at least one other sequence. The 
comparison can include a sequence analysis or comparison algorithm or computer program 
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may 
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences 
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM-compatible 
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux, 
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed, 
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention 
in a file format suitable for retrieval and processing in a computerized sequence analysis, 
comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing devices 
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at 
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic 
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells) 
composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem, 
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for comparing a 
query target to a database containing an array of data structures, such as an assay result 
obtained by the method of the invention, and ranking database targets based on the degree of 
identity and gap weight to the target data. A central processor is preferably initialized to load 
and execute the computer program for alignment and/or comparison of the assay results. 
Data for a query target is entered into the central processor via an I/O device. Execution of 
the computer program results in the central processor retrieving the assay data from the data 
file, which comprises a binary description of an assay result. 

The target data or record and the computer program can be transferred to secondary 
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or 
SDRAM). Targets are ranked according to the degree of correspondence between a selected 
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
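
The rank-ordering of database targets against a query target can be sketched as follows. This is a simplified illustration, not the alignment programs named in this section: as a stand-in similarity value it uses the ratio computed by `difflib.SequenceMatcher` from the Python standard library, which does not model gap weights. The function name `rank_targets` and the in-memory dictionary used as the "database" are assumptions for the example.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank database sequences by similarity to the query sequence.

    `database` maps a target identifier to its sequence string.
    Returns a list of (identifier, similarity) pairs, most similar first.
    The similarity value is SequenceMatcher.ratio(), a stand-in for a
    proper alignment score.
    """
    scored = [
        (target_id, SequenceMatcher(None, query, sequence).ratio())
        for target_id, sequence in database.items()
    ]
    # Sort by descending similarity; ties keep their original order.
    return sorted(scored, key=lambda pair: -pair[1])
```

In a fuller system, the scoring call would be replaced by an alignment program that accounts for identity and gap weight, while the rank-ordering step would remain the same.
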

Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted
proteins, transmembrane proteins, or intracellular proteins. In one embodiment, the prostate
cancer protein is an intracellular protein. Intracellular proteins may be found in the
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular
function and replication (including, e.g., signaling pathways); aberrant expression of such
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Alberts (ed.
1994) Molecular Biology of the Cell (3d ed.) Garland). For example, many intracellular
proteins have enzymatic activity such as protein kinase activity, protein phosphatase activity,
protease activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular
proteins also serve as docking proteins that are involved in organizing complexes of proteins,
or targeting proteins to various subcellular localizations, and are involved in maintaining the
structural integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats, and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res.
26:320-322).

In another embodiment, the prostate cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains. 
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases 
and receptor serine/threonine protein kinases contain a single transmembrane domain. 
However, various other proteins including channels and adenylyl cyclases contain numerous 
transmembrane domains. Many important cell surface receptors such as G protein coupled 
receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they contain 
7 membrane spanning regions. Characteristics of transmembrane domains include 
approximately 17 consecutive hydrophobic amino acids that may be followed by charged 
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the 
localization and number of transmembrane domains within the protein may be predicted (see, 
e.g., PSORT web site http://psort.nibb.ac.jp/). Important transmembrane protein receptors 
include, but are not limited to, the insulin receptor, insulin-like growth factor receptor, human 
growth hormone receptor, glucose transporters, transferrin receptor, epidermal growth factor 
receptor, low density lipoprotein receptor, leptin receptor, and interleukin receptors, e.g., 
IL-1 receptor, IL-2 receptor, etc. 
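The sequence-based prediction described above can be illustrated with a minimal sketch. It simply scans an amino acid sequence for runs of at least 17 consecutive hydrophobic residues. The residue set, the function name, and the example sequence are assumptions made for illustration only; this is not the method used by PSORT, which applies more elaborate scoring.

```python
# Assumed set of hydrophobic residues (one-letter codes); other definitions exist.
HYDROPHOBIC = set("AILMFVWGC")


def find_hydrophobic_runs(seq, min_len=17):
    """Return (start, end) index pairs (end exclusive) of runs of at least
    min_len consecutive hydrophobic residues in seq."""
    runs = []
    start = None
    for i, aa in enumerate(seq.upper()):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i  # a new hydrophobic run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq)))
    return runs


# Hypothetical sequence: a 20-residue leucine stretch flanked by charged residues.
example = "MKT" + "L" * 20 + "RRK" + "DE" * 5
candidate_domains = find_hydrophobic_runs(example)
```

Each returned pair marks one candidate membrane-spanning region, so the length of the list gives a rough estimate of the number of transmembrane domains.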

The extracellular domains of transmembrane proteins are diverse; however, conserved 
motifs are found repeatedly among various extracellular domains. Conserved structure 
and/or functions have been ascribed to different extracellular motifs. Many extracellular 
domains are involved in binding to other molecules. In one aspect, extracellular domains are 
found on receptors. Factors that bind the receptor domain include circulating ligands, which 
may be peptides, proteins, or small molecules such as adenosine and the like. For example, 
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their 
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines, 
mitogenic factors, neurotrophic factors and the like. Extracellular domains also bind to cell- 
associated molecules. In this respect, they mediate cell-cell interactions. Cell-associated 
ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI) anchor, or 
may themselves be transmembrane proteins. Extracellular domains also associate with the 
extracellular matrix and contribute to the maintenance of the cell structure. 

Prostate cancer proteins that are transmembrane are particularly preferred in the 
present invention as they are readily accessible targets for immunotherapeutics, as are 
described herein. In addition, as outlined below, transmembrane proteins can also be useful 
in imaging modalities. Antibodies may be used to label such readily accessible proteins in 
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are 
typically permeabilized to provide access to intracellular proteins. In addition, some 
membrane proteins can be processed to release a soluble protein, or to expose a residual 
fragment. Released soluble proteins may be useful diagnostic markers; processed residual 
protein fragments may be useful prostate markers of disease. 

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the 
secretion of which can be either constitutive or regulated. These proteins may have a signal 
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted 
proteins are involved in numerous physiological events; by virtue of their circulating nature, 
they often serve to transmit signals to various other cell types. The secreted protein may 
function in an autocrine manner (acting on the cell that secreted the factor), a paracrine 
manner (acting on cells in close proximity to the cell that secreted the factor), an endocrine 
manner (acting on cells at a distance, e.g., secretion into the blood stream), or an exocrine 
manner (secretion, e.g., through a duct or to an adjacent epithelial surface, as with sweat glands, 
sebaceous glands, pancreatic ducts, lacrimal glands, mammary glands, wax producing glands 
of the ear, etc.). Thus secreted molecules find use in modulating or altering numerous aspects 
of physiology. Prostate cancer proteins that are secreted proteins are particularly preferred in 
the present invention as they serve as good targets for diagnostic markers, e.g., for blood, 
plasma, serum, or stool tests. Those which are enzymes may be antibody or small molecule 
targets. Others may be useful as vaccine targets, e.g., via CTL mechanisms. 

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by substantial 
nucleic acid and/or amino acid sequence homology or linkage to the prostate cancer 
sequences outlined herein. Such homology can be based upon the overall nucleic acid or 
amino acid sequence, and is generally determined as outlined below, using either homology 
programs or hybridization conditions. Typically, linked sequences on an mRNA are found on 
the same molecule. 

The prostate cancer nucleic acid sequences of the invention, e.g., the sequences in 
Tables 1A-4, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" 
in this context includes coding regions, non-coding regions, and mixtures of coding and 
non-coding regions. Accordingly, as will be appreciated by those in the art, using the sequences 
provided herein, extended sequences, in either direction, of the prostate cancer genes can be 
obtained, using techniques well known in the art for cloning either longer sequences or the 
full length sequences; see Ausubel, et al., supra. Much can be done by informatics, and many 
sequences can be clustered to include multiple sequences corresponding to a single gene, e.g., 
using systems such as UniGene (see http://www.ncbi.nlm.nih.gov/UniGene/). 

Once the prostate cancer nucleic acid is identified, it can be cloned and, if necessary, 
its constituent parts recombined to form the entire prostate cancer nucleic acid coding regions 
or the entire mRNA sequence. Once isolated from its natural source, e.g., contained within a 
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the 
recombinant prostate cancer nucleic acid can be further used as a probe to identify and isolate 
other prostate cancer nucleic acids, e.g., extended coding regions. It can also be used as a 
"precursor" nucleic acid to make modified or variant prostate cancer nucleic acids and 
proteins. 

The prostate cancer nucleic acids of the present invention are used in several ways. In 
a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are made and 
attached to biochips to be used in screening and diagnostic methods, as outlined below, or for 
administration, e.g., for gene therapy, vaccine, and/or antisense applications. Alternatively, 
the prostate cancer nucleic acids that include coding regions of prostate cancer proteins can 
be put into expression vectors for the expression of prostate cancer proteins, again for 
screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic acids (both 
the nucleic acid sequences outlined in the figures and/or the complements thereof) are made. 
The nucleic acid probes attached to the biochip are designed to be substantially 
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target 
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that 
hybridization of the target sequence and the probes of the present invention occurs. As 
outlined below, this complementarity need not be perfect; there may be base pair mismatches 
which will interfere with hybridization between the target sequence and the single stranded 
nucleic acids of the present invention. However, if the number of mutations is so great that 
no hybridization can occur under even the least stringent of hybridization conditions, the 
sequence is not a complementary target sequence. Thus, by "substantially complementary" 
herein is meant that the probes are sufficiently complementary to the target sequences to 
hybridize under normal reaction conditions, particularly high stringency conditions, as 
outlined herein. 
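As a rough illustration of imperfect complementarity, the following sketch counts the positions at which a probe fails to base pair with an equal-length target site. The function names and the mismatch threshold are hypothetical; whether hybridization actually occurs depends on the stringency conditions (temperature, salt, probe length), which this sketch does not model.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target_site):
    """Count positions where probe does not pair with an equal-length target site.

    Both sequences are written 5' to 3'; a perfectly complementary probe equals
    the reverse complement of the target site.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


def is_substantially_complementary(probe, target_site, max_mismatches=2):
    """Hypothetical cutoff: tolerate at most max_mismatches unpaired positions."""
    return count_mismatches(probe, target_site) <= max_mismatches
```

For example, `count_mismatches(reverse_complement(site), site)` is zero for any site, and each base change in the probe raises the count by one.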

A nucleic acid probe is generally single stranded but can be partially single and 
partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, 
and from about 30 to about 50 bases being particularly preferred. That is, generally whole 
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with either 
overlapping probes or probes to different sections of the target being used. That is, two, 
three, four or more probes, with three being preferred, are used to build in a redundancy for a 
particular target. The probes can be overlapping (i.e., have some sequence in common), or 
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity. 

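The use of several probes per target can be sketched as follows. The function below picks evenly spaced windows along a target sequence, which overlap or stay separate depending on the target length. The function name, the default probe length of 40 bases (within the 30 to 50 base range given above), and the even spacing rule are illustrative assumptions, not a prescribed design procedure.

```python
def tile_probes(target, probe_len=40, n_probes=3):
    """Return n_probes (start, window) pairs evenly spaced along target.

    Windows are given in the sense of the target; an actual probe would be
    the complement of its window.
    """
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if probe_len > len(target):
        raise ValueError("probe_len exceeds target length")
    if n_probes == 1:
        starts = [0]
    else:
        span = len(target) - probe_len  # last permissible start position
        starts = [round(i * span / (n_probes - 1)) for i in range(n_probes)]
    return [(s, target[s:s + probe_len]) for s in starts]


# Hypothetical 100-base target: three 40-base windows overlap by 10 bases each.
target = "ACGT" * 25
probes = tile_probes(target)
```

With a longer target the same call yields separate, non-overlapping windows, which corresponds to probes directed to different sections of the target.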
As will be appreciated by those in the art, nucleic acids can be attached or 
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical 
equivalents herein is meant the association or binding between the nucleic acid probe and the 
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and 
removal as outlined below. The binding can typically be covalent or non-covalent. By 
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic, 
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent 
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of 
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical 
equivalents herein is meant that the two moieties, the solid support and the probe, are 
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds. 
Covalent bonds can be formed directly between the probe and the solid support or can be 
formed by a cross linker or by inclusion of a specific reactive group on either the solid 
support or the probe or both molecules. Immobilization may also involve a combination of 
covalent and non-covalent interactions. 

In general, the probes are attached to the biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or 
other grammatical equivalents herein is meant a material that can be modified to contain 
discrete individual sites appropriate for the attachment or association of the nucleic acid 
probes and is amenable to at least one detection method. As will be appreciated by those in 
the art, the number of possible substrates is very large; possible substrates include, but are 
not limited to, glass and modified or functionalized glass, plastics (including acrylics, 
polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, 
polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, 
silica or silica-based materials including silicon and modified silicon, carbon, metals, 
inorganic glasses, plastics, etc. In general, the substrates allow optical detection and do not 
appreciably 
fluoresce. A preferred substrate is described in WO0055627, herein incorporated by 
reference in its entirety. 

Generally the substrate is planar, although as will be appreciated by those in the art, 
other configurations of substrates may be used as well. For example, the probes may be 
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample 
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed 
cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g., 
the biochip is derivatized with a chemical functional group including, but not limited to, 
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being 
particularly preferred. Using these functional groups, the probes can be attached using 
functional groups on the probes. For example, nucleic acids containing amino groups can be 
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g., 
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company 
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases, 
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized as is known in the art, and then 
attached to the surface of the solid support. As will be appreciated by those skilled in the art, 
either the 5' or 3' terminus may be attached to the solid support, or attachment may be via an 
internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very strong, 
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to 
surfaces covalently coated with streptavidin, resulting in attachment. 

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in 
the art. For example, photoactivation techniques utilizing photopolymerization compounds 
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in 
situ, using well known photolithographic techniques, such as those described in WO 
95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited 
within, all of which are expressly incorporated by reference; these methods of attachment 
form the basis of the Affymetrix GeneChip™ technology. 

Often, amplification-based assays are performed to measure the expression level of 
prostate cancer-associated sequences. These assays are typically performed in conjunction 
with reverse transcription. In such assays, a prostate cancer-associated nucleic acid sequence 
acts as a template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In 
a quantitative amplification, the amount of amplification product will be proportional to the 
amount of template in the original sample. Comparison to appropriate controls provides a 
measure of the amount of prostate cancer-associated RNA. Methods of quantitative 
amplification are well known to those of skill in the art. Detailed protocols for quantitative 
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and 
Applications Academic Press. 
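One common way to carry out the comparison to controls is the comparative threshold cycle calculation, which the present text does not itself name; it is shown here only as an illustrative sketch. The function name, the use of a reference transcript for normalization, and the assumption of perfect doubling per cycle (efficiency of 2.0) are all assumptions, and the cycle values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control,
                        efficiency=2.0):
    """Fold change of a target transcript in a sample relative to a control.

    Each threshold cycle (Ct) is first normalized to a reference transcript;
    the difference between sample and control is then converted to a fold
    change, assuming the product multiplies by `efficiency` every cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return efficiency ** -(delta_sample - delta_control)


# Hypothetical values: the target crosses threshold three cycles earlier in
# the tumor sample than in the normal control, at equal reference levels.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

A lower threshold cycle means more starting template, so a negative difference between sample and control corresponds to a fold change greater than one.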

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent 
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be 
extended due to a blocking agent at the 3' end. When the PCR product is amplified in 
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the 
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3' 
quenching agent, thereby resulting in an increase in fluorescence as a function of 
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com). 

Other suitable amplification methods include, but are not limited to, ligase chain 
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560-569, Landegren, et al. (1988) 
Science 241:1077-1080, and Barringer, et al. (1990) Gene 89:117-122), transcription 
amplification (Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), self-sustained 
sequence replication (Guatelli, et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), dot 
PCR, and linker adapter PCR, etc. 


Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding prostate 
cancer proteins, are used to make a variety of expression vectors to express prostate cancer 
proteins which can then be used in screening assays, as described below. Expression vectors 
and recombinant DNA technology are well known to those of skill in the art (see, e.g., 
Ausubel, supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems Academic 
Press) and are used to express proteins. The expression vectors may be either self-replicating 
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these 
expression vectors include transcriptional and translational regulatory nucleic acid operably 
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences" 
refers to DNA sequences used for the expression of an operably linked coding sequence in a 
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a 
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are 
known to utilize promoters, polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional relationship with 
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is 
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in 
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding 
sequence if it affects the transcription of the sequence; a ribosome binding site is operably 
linked to a coding sequence if it is positioned so as to facilitate translation, and sequences 
may be operably linked when they are physically linked on the same molecule. Generally, 
"operably linked" means that the DNA sequences being linked are contiguous, and, in the 
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have 
to be contiguous. Linking is typically accomplished by ligation at convenient restriction 
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in 
accordance with conventional practice. Transcriptional and translational regulatory nucleic 
acid will generally be appropriate to the host cell used to express the prostate cancer protein. 
Numerous types of appropriate expression vectors, and suitable regulatory sequences are 
known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include, but are 
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop 
sequences, translational start and stop sequences, and enhancer or activator sequences. In a 
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The promoters 
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which 
combine elements of more than one promoter, are also known in the art, and are useful in the 
present invention. 

In addition, an expression vector may comprise additional elements. For example, the 
expression vector may have two replication systems, thus allowing it to be maintained in two 
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for 
cloning and amplification. Furthermore, for integrating expression vectors, the expression 
vector contains at least one sequence homologous to the host cell genome, and preferably two 
homologous sequences which flank the expression construct. The integrating vector may be 
directed to a specific locus in the host cell by selecting the appropriate homologous sequence 
for inclusion in the vector. Constructs for integrating vectors are well known in the art (e.g., 
Fernandez and Hoeffler, supra). 

In addition, in a preferred embodiment, the expression vector contains a selectable 
marker gene to allow the selection of transformed host cells. Selection genes are well known 
in the art and will vary with the host cell used. 

The prostate cancer proteins of the present invention are produced by culturing a host 
cell transformed with an expression vector containing nucleic acid encoding a prostate cancer 
protein, under the appropriate conditions to induce or cause expression of the prostate cancer 
protein. Conditions appropriate for prostate cancer protein expression will vary with the 
choice of the expression vector and the host cell, and will be easily ascertained by one skilled 
in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest 
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and 
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae 
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK, 
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a 
macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in mammalian 
cells. Mammalian expression systems are also known in the art, and include retroviral and 
adenoviral systems. One expression vector system is a retroviral vector system such as is 
generally described in PCT/US97/01019 and PCT/US97/01048, both of which are hereby 
expressly incorporated by reference. Of particular use as mammalian promoters are the 
promoters from mammalian viral genes, since the viral genes are often highly expressed and 
have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor 
virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the 
CMV promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription 
termination and polyadenylation sequences recognized by mammalian cells are regulatory 
regions located 3' to the translation stop codon and thus, together with the promoter elements, 
flank the coding sequence. Examples of transcription terminator and polyadenylation signals 
include those derived from SV40. 

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as 
other hosts, are well known in the art, and will vary with the host cell used. Techniques 
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated 
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the 
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei. 

In a preferred embodiment, prostate cancer proteins are expressed in bacterial 
systems. Bacterial expression systems are well known in the art. Promoters from 
bacteriophage may also be used and are known in the art. In addition, synthetic promoters 
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac 
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring 
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and 
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome 
binding site is desirable. The expression vector may also include a signal peptide sequence 
that provides for secretion of the prostate cancer protein in bacteria. The protein is either 
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located 
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial 
expression vector may also include a selectable marker gene to allow for the selection of 
bacterial strains that have been transformed. Suitable selection genes include genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes, 
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These 
components are assembled into expression vectors. Expression vectors for bacteria are well 
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, 
and Streptomyces lividans, among others (e.g., Fernandez and Hoeffler, supra). The 
bacterial expression vectors are transformed into bacterial host cells using techniques well 
known in the art, such as calcium chloride treatment, electroporation, and others. 

In one embodiment, prostate cancer proteins are produced in insect cells. Expression 
vectors for the transformation of insect cells, and in particular, baculovirus-based expression 
vectors, are well known in the art. 

In a preferred embodiment, prostate cancer protein is produced in yeast cells. Yeast 
expression systems are well known in the art, and include expression vectors for 
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, 
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris, 
Schizosaccharomyces pombe, and Yarrowia lipolytica. 

The prostate cancer protein may also be made as a fusion protein, using techniques 
well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired 
epitope is small, the prostate cancer protein may be fused to a carrier protein to form an 
immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to 
increase expression, or for other reasons. For example, when the prostate cancer protein is a 
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic 
acid for expression purposes. 

In a preferred embodiment, the prostate cancer protein is purified or isolated after 
expression. Prostate cancer proteins may be isolated or purified in a variety of ways known 
to those skilled in the art depending on what other components are present in the sample. 
Standard purification methods include electrophoretic, molecular, immunological and 
chromatographic techniques, including ion exchange, hydrophobic, affinity, and 
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer 
protein may be purified using a standard anti-prostate cancer protein antibody column. 
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also 
useful. For general guidance in suitable purification techniques, see Scopes (1982) Protein 
Purification Springer-Verlag. The degree of purification necessary will vary depending on 
the use of the prostate cancer protein. In some instances no purification will be necessary. 

Once expressed and, if necessary, purified, the prostate cancer proteins and nucleic 
acids are useful in a number of applications. They may be used as immunoselection reagents, 
as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins 

In one embodiment, the prostate cancer proteins are derivative or variant prostate 
cancer proteins as compared to the wild-type sequence. That is, as outlined more fully below, 
the derivative prostate cancer peptide will often contain at least one amino acid substitution, 
deletion or insertion, with amino acid substitutions being particularly preferred. The amino 
acid substitution, insertion, or deletion may occur at almost any residue within the prostate 
cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the present 
invention are amino acid sequence variants. These variants typically fall into one or more of 
three classes: substitutional, insertional, or deletional variants. These variants ordinarily are 
prepared by site specific mutagenesis of nucleotides in the DNA encoding the prostate cancer 
protein, using cassette or PCR mutagenesis or other techniques well known in the art, to 
produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell 
culture as outlined above. However, variant prostate cancer protein fragments having up to 
about 100-150 residues may be prepared by in vitro synthesis using established techniques. 
Amino acid sequence variants are characterized by the predetermined nature of the variation, 
a feature that sets them apart from naturally occurring allelic or interspecies variation of the 
prostate cancer protein amino acid sequence. The variants typically exhibit the same 
qualitative biological activity as the naturally occurring analogue, although variants can also 
be selected which have modified characteristics as will be more fully outlined below. 

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed prostate cancer variants screened 
for the optimal combination of desired activity. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, e.g., 
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using 
assays of prostate cancer protein activities. 
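The idea of varying a predetermined site while leaving the mutation itself open can be sketched in silico by enumerating every single-residue substitution at one position. The function name, the variant naming scheme (wild-type residue, 1-based position, new residue), and the example sequence are illustrative assumptions; the sketch lists candidate variants only and says nothing about how they would be made or screened.

```python
# The twenty standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_variants(protein, position):
    """Return a dict mapping variant names to sequences for all 19 possible
    substitutions at a 1-based position of a protein sequence."""
    idx = position - 1
    if not 0 <= idx < len(protein):
        raise ValueError("position is outside the sequence")
    wild_type = protein[idx]
    variants = {}
    for aa in AMINO_ACIDS:
        if aa == wild_type:
            continue  # skip the unchanged residue
        name = f"{wild_type}{position}{aa}"
        variants[name] = protein[:idx] + aa + protein[idx + 1:]
    return variants


# Hypothetical five-residue peptide, varied at position 2.
variants = substitution_variants("MKTLL", 2)
```

Each resulting sequence would then be a candidate for expression and screening in an activity assay.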

Amino acid substitutions are typically of single residues; insertions usually will be on 
the order of from about 1 to 20 amino acids, although considerably larger insertions may be 
tolerated. Deletions range from about 1 to about 20 residues, although in some cases 
deletions may be much larger. 

Substitutions, deletions, insertions or a combination thereof may be used to arrive at a 
final derivative. Generally these changes are done on a few amino acids to minimize the 
alteration of the molecule. However, larger changes may be tolerated in certain 
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will elicit 
the same immune response as the naturally-occurring analog, although variants also are 
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by selecting 
substitutions that are less conservative than those described above. For example, 
substitutions may be made which more significantly affect: the structure of the polypeptide 
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. 
The substitutions which in general are expected to produce the greatest changes in the 
polypeptide's properties are those in which (a) a hydrophilic residue, e.g., serinyl or threonyl 
is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or 
alanyl; (b) a cysteine or proline is substituted for (or by) another residue; (c) a residue having 
an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, 
e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine. 
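
As an illustration only, categories (a) through (d) above can be sketched as a small lookup in Python. The residue sets below contain just the examples the text gives after "e.g.", so they are not exhaustive, and the function and set names are hypothetical rather than part of the described methods.

```python
# Residue groups copied from the "e.g." examples in categories (a)-(d).
# These sets are illustrative and deliberately not exhaustive.
HYDROPHILIC = {"S", "T"}                  # serinyl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine
NO_SIDE_CHAIN = {"G"}                     # glycine


def _swapped(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2 ("for (or by)")."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def disruptive_categories(original, replacement):
    """Return the categories (a)-(d) that a single-residue substitution falls under.

    Residues are one-letter amino acid codes. An empty list means the
    substitution matches none of the listed categories.
    """
    if original == replacement:
        return []
    hits = []
    if _swapped(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        hits.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        hits.append("b")
    if _swapped(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        hits.append("c")
    if _swapped(original, replacement, BULKY, NO_SIDE_CHAIN):
        hits.append("d")
    return hits
```

For instance, `disruptive_categories("K", "E")` reports category (c), while a leucine-to-isoleucine change matches none of the categories.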

Covalent modifications of prostate cancer polypeptides are included within the scope 
of this invention. One type of covalent modification includes reacting targeted amino acid 
residues of a prostate cancer polypeptide with an organic derivatizing agent that is capable of 
reacting with selected side chains or the N- or C-terminal residues of a prostate cancer 
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking 
prostate cancer polypeptides to a water-insoluble support matrix or surface for use in the 
method for purifying anti-prostate cancer polypeptide antibodies or screening assays, as is 
more fully described below. Commonly used crosslinking agents include, e.g., 1,1- 
bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters 
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters 
such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as 
bis-N-maleimido-1,8-octane, and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate. 

Other modifications include deamidation of glutaminyl and asparaginyl residues to 
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and 
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues, 
methylation of the amino groups of the lysine, arginine, and histidine side chains (e.g., pp. 
79-86, Creighton (1983) Proteins: Structure and Molecular Properties Freeman), acetylation 
of the N-terminal amine, and amidation of a C-terminal carboxyl group. 

Another type of covalent modification of the prostate cancer polypeptide included 
within the scope of this invention comprises altering the native glycosylation pattern of the 
polypeptide. "Altering the native glycosylation pattern" is intended for purposes herein to 
mean deleting one or more carbohydrate moieties found in native sequence prostate cancer 
polypeptide, and/or adding one or more glycosylation sites that are not present in the native 
sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many ways. 
For example the use of different cell types to express prostate cancer-associated sequences 
can result in different glycosylation patterns. 

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer 
amino acid sequence may optionally be altered through changes at the DNA level, 
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected 
bases such that codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the prostate 
cancer polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. 
Such methods are described in the art, e.g., in WO 87/05330, and pp. 259-306 in Aplin and 
Wriston (1981) CRC Crit. Rev. Biochem. 

Removal of carbohydrate moieties present on the prostate cancer polypeptide may be 
accomplished chemically or enzymatically or by mutational substitution of codons encoding 
amino acid residues that serve as targets for glycosylation. Chemical deglycosylation 
techniques are known in the art and described, e.g., by Hakimuddin, et al. (1987) Arch. 
Biochem. Biophys. 259:52-57; and Edge, et al. (1981) Anal. Biochem. 118:131-137. 
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a 
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth. 
Enzymol. 138:350-359. 

Another type of covalent modification of a prostate cancer polypeptide comprises linking the 
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., 
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in 
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192; or 4,179,337. 

Prostate cancer polypeptides of the present invention may also be modified in a way 
to form chimeric molecules comprising a prostate cancer polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric 
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which 
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The 
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using 
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the 
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag 
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative 
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide 
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of 
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known in the art. 
Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 
and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. 
(1988) Mol. Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7, and 
9E10 antibodies thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); 
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. 
(1990) Protein Engineering 3:547-553). Other tag polypeptides include the Flag-peptide 
(Hopp, et al. (1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. 
(1992) Science 255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem. 
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:6393-6397). 

Also included are other prostate cancer proteins of the prostate cancer family, and 
prostate cancer proteins from other organisms, which are cloned and expressed as outlined 
below. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be 
used to find other related prostate cancer proteins from humans or other organisms. As will 
be appreciated by those in the art, particularly useful probe and/or PCR primer sequences 
include the unique areas of the prostate cancer nucleic acid sequence. As is generally known 
in the art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with 
from about 20 to about 30 being preferred, and may contain inosine as needed. The 
conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols, supra). 
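
The primer length ranges just stated can be checked with a short Python sketch. This is an illustration under stated assumptions: the text qualifies each bound with "about", whereas the sketch treats the bounds as hard cutoffs, and the function name and return labels are hypothetical.

```python
# Allowed symbols: the four DNA bases plus I for inosine, which the text permits.
VALID_BASES = set("ACGTI")


def primer_length_class(primer):
    """Classify a primer by length using the ranges given in the text.

    Returns "preferred" for 20-30 nucleotides, "acceptable" for 15-35,
    and "outside range" otherwise. The hard cutoffs are an assumption;
    the text says "about" for each bound.
    """
    seq = primer.upper()
    if not seq or set(seq) - VALID_BASES:
        raise ValueError("primer must be a non-empty string of A, C, G, T or I")
    n = len(seq)
    if 20 <= n <= 30:
        return "preferred"
    if 15 <= n <= 35:
        return "acceptable"
    return "outside range"
```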

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to generate 
antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein should 
share at least one epitope or determinant with the full length protein. By "epitope" or 
"determinant" herein is typically meant a portion of a protein which will generate and/or bind 
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies 
made to a smaller prostate cancer protein will be able to bind to the full-length protein, 
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is, 
antibodies generated to a unique epitope show little or no cross-reactivity. 

Methods of preparing polyclonal antibodies are known to the skilled artisan (e.g., 
Coligan, supra; and Harlow and Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple 
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein 
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It 
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in 
the mammal being immunized. Examples of such immunogenic proteins include but are not 
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean 
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete 
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose 
dicorynomycolate). The immunization protocol may be selected by one skilled in the art 
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies 
may be prepared using hybridoma methods, such as those described by Kohler and Milstein 
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host 
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce 
or are capable of producing antibodies that will specifically bind to the immunizing agent. 
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will 
typically include a polypeptide encoded by a nucleic acid of Tables 1A-4 or fragment thereof, 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are 
used if cells of human origin are desired, or spleen cells or lymph node cells are used if 
non-human mammalian sources are desired. The lymphocytes are then fused with an 
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a 
hybridoma cell (see pp. 59-103 in Goding (1986) Monoclonal Antibodies: Principles and 
Practice Academic Press). Immortalized cell lines are usually transformed mammalian cells, 
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse 
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture 
medium that preferably contains one or more substances that inhibit the growth or survival of 
the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium 
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are 
monoclonal, preferably human or humanized, antibodies that have binding specificities for at 
least two different antigens or that have binding specificities for two epitopes on the same 
antigen. In one embodiment, one of the binding specificities is for a protein encoded by a 
nucleic acid of Tables 1A-4 or a fragment thereof, the other one is for another antigen, and 
preferably for a cell-surface protein or receptor or receptor subunit, preferably one that is 
tumor specific. Alternatively, tetramer-type technology may create multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are capable of 
reducing or eliminating a biological function of a prostate cancer protein, as is described 
below. That is, the addition of anti-prostate cancer protein antibodies (either polyclonal or 
preferably monoclonal) to prostate cancer tissue (or cells containing prostate cancer) may 
reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in activity, 
growth, size or the like is preferred, with at least about 50% being particularly preferred and 
about a 95-100% decrease being especially preferred. 

In a preferred embodiment the antibodies to the prostate cancer proteins are 
humanized antibodies (e.g., Xenerex Biosciences; Medarex, Inc.; Abgenix, Inc.; Protein 
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric 
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, 
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain 
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include 
human immunoglobulins (recipient antibody) in which residues from a complementarity 
determining region (CDR) of the recipient are replaced by residues from a CDR of a 
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity, 
affinity and capacity. In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, a humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or 
substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the framework (FR) regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise 
at least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 
332:323-329; and Presta (1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be 
essentially performed following methods of Winter and co-workers (see, e.g., Jones, et al. 
(1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; and Verhoeyen, et 
al. (1988) Science 239:1534-1536), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are 
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact 
human variable domain has been substituted by the corresponding sequence from a non- 
human species. 

Human antibodies can also be produced using various techniques known in the art, 
including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381-388; 
Marks, et al. (1991) J. Mol. Biol. 222:581-597) or the preparation of human monoclonal 
antibodies (e.g., p77 in Cole, et al. (1985) Monoclonal Antibodies and Cancer Therapy Liss; 
and Boerner, et al. (1991) J. Immunol. 147(1):86-95). Similarly, human antibodies can be 
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in 
which the endogenous immunoglobulin genes have been partially or completely inactivated. 
Upon challenge, human antibody production is observed, which closely resembles that seen 
in humans in most respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 
5,625,126; 5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. 
(1992) Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison 
(1994) Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; 
Neuberger (1996) Nature Biotechnology 14:826; Lonberg and Huszar (1995) Intern. Rev. 
Immunol. 13:65-93. 

By immunotherapy is meant treatment of prostate cancer with an antibody raised 
against prostate cancer proteins. As used herein, immunotherapy can be passive or active. 
Passive immunotherapy as defined herein is the passive transfer of antibody to a recipient 
(patient). Active immunization is the induction of antibody and/or T-cell responses in a 
recipient (patient). Induction of an immune response is the result of providing the recipient 
with an antigen to which antibodies are raised. As appreciated by one of ordinary skill in the 
art, the antigen may be provided by injecting a polypeptide against which antibodies are 
desired to be raised into a recipient, or contacting the recipient with a nucleic acid capable of 
expressing the antigen and under conditions for expression of the antigen, leading to an 
immune response. 

In a preferred embodiment the prostate cancer proteins against which antibodies are 
raised are secreted proteins as described above. Without being bound by theory, antibodies 
used for treatment bind the secreted protein and prevent it from binding to its receptor, thereby 
inactivating the secreted prostate cancer protein. 

In another preferred embodiment, the prostate cancer protein to which antibodies are 
raised is a transmembrane protein. Without being bound by theory, antibodies used for 
treatment bind the extracellular domain of the prostate cancer protein and prevent it from 
binding to other proteins, such as circulating ligands or cell-associated molecules. The 
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will 
be appreciated by one of ordinary skill in the art, the antibody may be a competitive, 
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the 
prostate cancer protein. The antibody is also often an antagonist of the prostate cancer 
protein. Further, the antibody may prevent activation of the transmembrane prostate cancer 
protein. In one aspect, when the antibody prevents the binding of other molecules to the 
prostate cancer protein, the antibody prevents growth of the cell. The antibody may also be 
used to target or sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, 
TNF-β, IL-1, IFN-γ, and IL-2, or chemotherapeutic agents including 5FU, vinblastine, 
actinomycin D, cisplatin, methotrexate, and the like. In some instances the antibody belongs 
to a sub-type that activates serum complement when complexed with the transmembrane 
protein, thereby mediating cytotoxicity or antibody-dependent cell-mediated cytotoxicity (ADCC). Thus, 
prostate cancer is treated by administering to a patient antibodies directed against the 
transmembrane prostate cancer protein. Antibody-labeling may activate a co-toxin, localize a 
toxin payload, or otherwise provide means to locally ablate cells. 

In another preferred embodiment, the antibody is conjugated to an effector moiety. 
The effector moiety can be a labeling moiety such as a radioactive label or fluorescent label, 
or can be a therapeutic moiety. In one aspect the therapeutic moiety is a small molecule that 
modulates the activity of the prostate cancer protein. In another aspect the therapeutic moiety 
modulates the activity of molecules associated with or in close proximity to the prostate 
cancer protein. The therapeutic moiety may inhibit enzymatic activity such as protease or 
collagenase or protein kinase activity associated with prostate cancer. 

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In 
this method, targeting the cytotoxic agent to prostate cancer tissue or cells results in a 
reduction in the number of afflicted cells, thereby reducing symptoms associated with 
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited 
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their 
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A 
chain, curcin, crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic 
agents also include radiochemicals made by conjugating radioisotopes to antibodies raised 
against prostate cancer proteins, or binding of a radionuclide to a chelating agent that has 
been covalently attached to the antibody. Targeting the therapeutic moiety to transmembrane 
prostate cancer proteins not only serves to increase the local concentration of therapeutic 
moiety in the prostate cancer afflicted area, but also serves to reduce deleterious side effects, 
e.g., by binding to normal tissues, that may be associated with the therapeutic moiety. 

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated 
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by 
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to 
the individual or cell. Moreover, where the prostate cancer protein can be targeted within a 
cell, e.g., in the nucleus, an antibody thereto contains a signal for that target localization, e.g., a 
nuclear localization signal. 

The prostate cancer antibodies of the invention specifically bind to prostate cancer 
proteins. By "specifically bind" herein is meant that the antibodies bind to the protein with a 
Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 
µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also 
important. 
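
The binding thresholds in the preceding paragraph can be read as a tiered scale in which a lower Kd means tighter binding. The Python sketch below illustrates that reading; it assumes that "at least" and "or better" both mean a Kd at or below the stated value, and the function name and tier labels are hypothetical.

```python
# Kd cutoffs from the text, expressed in molar units, tightest first.
# Interpreting "at least" / "or better" as "Kd at or below this value"
# is an assumption about the intended meaning.
KD_TIERS = [
    (1e-8, "most preferred"),   # 0.01 uM or better
    (1e-7, "preferred"),        # 0.1 uM or better
    (1e-6, "more usual"),       # about 1 uM
    (1e-4, "specific"),         # about 0.1 mM
]


def binding_tier(kd_molar):
    """Return the tightest tier whose cutoff the measured Kd satisfies."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    for cutoff, label in KD_TIERS:
        if kd_molar <= cutoff:
            return label
    return "not specific"
```

Under this reading, an antibody with a Kd of 50 nM (5e-8 M) falls in the "preferred" tier, and one with a Kd of 1 mM does not meet the definition of specific binding.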

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different 
cellular states in the prostate cancer phenotype. After androgen ablation therapy, cells that 
survive the therapy undergo a period of quiescence followed at some time later by active cell 
division. As explained above, there are a variety of expression patterns characteristic of the 
prostate cancer genes involved in androgen-independent prostate cancer. Some genes are 
expressed early in the time course following ablation therapy, then drop off in expression, 
and then express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A). 
Other genes are expressed early in the time course following ablation therapy, then drop off 
in expression, and do not express again with emergence of androgen-independence (hi-lo-lo 
pattern in Table 1A). Still other genes are not expressed early in the time course, but express 
only with emergence of androgen-independence (lo-lo-hi pattern in Table 1A). Other genes 
are not expressed early in the time course, but then express as androgen is withdrawn and 
continue to express with emergence of androgen-independence (lo-hi-hi pattern in Table 1A). 
Finally, some genes are not expressed early in the time course, but then express as androgen 
is withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern 
in Table 1A). Thus, the data suggest that different antigens are expressed in quiescent cells 
and actively dividing androgen-independent prostate cancer cells. 
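
The pattern labels used above (hi-lo-hi, lo-lo-hi, and so on) can be illustrated with a short Python sketch that labels each time point as "hi" or "lo" and groups genes by the resulting pattern. The source does not state how a level is called hi or lo, so the single cutoff used here, the gene names, and the expression values are all hypothetical.

```python
from collections import defaultdict


def expression_pattern(levels, threshold):
    """Label each time point "hi" or "lo" and join the labels with hyphens.

    `levels` is a sequence of expression values ordered along the time
    course. `threshold` is a hypothetical cutoff; the text does not say
    how hi and lo are determined.
    """
    return "-".join("hi" if level >= threshold else "lo" for level in levels)


def group_by_pattern(gene_levels, threshold):
    """Map each pattern label to the sorted list of genes that show it."""
    groups = defaultdict(list)
    for gene, levels in gene_levels.items():
        groups[expression_pattern(levels, threshold)].append(gene)
    return {pattern: sorted(genes) for pattern, genes in groups.items()}


# Made-up values for three time points: early after ablation,
# during androgen withdrawal, and at androgen-independence.
example = {
    "geneA": [900, 50, 800],
    "geneB": [850, 40, 30],
    "geneC": [20, 35, 700],
}
patterns = group_by_pattern(example, threshold=200)
```

The same function handles a time course with four points, since it simply labels however many values it is given.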

In another aspect, the RNA expression levels of genes are determined for different 
cellular states in the prostate cancer phenotype. After androgen ablation therapy, cells that 
survive the therapy undergo a period of quiescence followed at some time later by active cell 
division. As explained above, there are a variety of expression patterns characteristic of the 
prostate cancer genes involved in androgen-independent prostate cancer. Some genes are 
expressed early in the time course following ablation therapy, then drop off in expression, 
and then express again with emergence of androgen-independence (hi-lo-lo-hi pattern in 
Table 2A). Other genes are expressed early in the time course following ablation therapy, 
then drop off in expression, and do not express again with emergence of 
androgen-independence (hi-lo-lo-lo and hi-hi-lo-lo patterns in Table 2A). Still other genes are not 
expressed early in the time course, but express only with emergence of 
androgen-independence (lo-lo-lo-hi pattern in Table 2A). Other genes are not expressed early in the 
time course, but then express as androgen is withdrawn and continue to express with 
emergence of androgen-independence (lo-lo-hi-hi pattern in Table 2A). Finally, some genes 
are not expressed early in the time course, but then express as androgen is withdrawn and 
drop off again with emergence of androgen-independence (lo-lo-hi-lo pattern in Table 2A). 
Thus, the data suggest that different antigens are expressed in quiescent cells (during 
androgen withdrawal) and actively dividing androgen-independent prostate cancer cells. 

Effective therapy to combat androgen-independent prostate cancer requires that the 
timing of therapy coincide with expression of the target genes. Patients can be monitored for 
the expression of certain diagnostic antigens that indicate the presence of quiescent cells or 
which indicate the transition to actively dividing androgen-independent prostate cancer cells. 
Thus, therapy to combat androgen-independent prostate cancer should begin at some time 
following androgen ablation therapy, depending on the particular target. Typically the 
transition from quiescence to actively dividing androgen-independent prostate cancer occurs 
between 6-24 months following androgen ablation therapy. Thus, preferred time periods for 
the therapies of the invention are as follows: 

Expression levels of genes in normal tissue (i.e., not undergoing prostate cancer) and 
in prostate cancer tissue (and in some cases, for varying severities of prostate cancer that 
relate to prognosis, as outlined below) or in non-malignant disease are evaluated to provide 
expression profiles. An expression profile of a particular cell state or point of development is 
essentially a "fingerprint" of the state. While two states may have a particular gene similarly 
expressed, the evaluation of a number of genes simultaneously allows the generation of a 
gene expression profile that is reflective of the state of the cell. By comparing expression 
profiles of cells in different states, information regarding which genes are important 
(including both up- and down-regulation of genes) in each of these states is obtained. Then, 
diagnosis may be performed or confirmed to determine whether a tissue sample has the gene 
expression profile of normal or cancerous tissue. This will provide for molecular diagnosis 
of related conditions. 

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression patterns 
within and among cells and tissue. Thus, a differentially expressed gene can qualitatively 
have its expression altered, including an activation or inactivation, in, e.g., normal versus 
prostate cancer tissue. Genes may be turned on or turned off in a particular state, relative to 
another state thus permitting comparison of two or more states. A qualitatively regulated 
gene will exhibit an expression pattern within a state or cell type which is detectable by 
standard techniques. Some genes will be expressed in one state or cell type, but not in both. 
Alternatively, the difference in expression may be quantitative, e.g., in that expression is 
increased or decreased; i.e., gene expression is either upregulated, resulting in an increased 
amount of transcript, or downregulated, resulting in a decreased amount of transcript. The 
degree to which expression differs need only be large enough to quantify via standard 
characterization techniques as outlined below, such as by use of Affymetrix GeneChip™ 
expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby expressly 
incorporated by reference. Other techniques include, but are not limited to, quantitative 
reverse transcriptase PCR, northern analysis and RNase protection. As outlined above, 
preferably the change in expression (i.e., upregulation or downregulation) is at least about 
50%, more preferably at least about 100%, more preferably at least about 150%, more 
preferably at least about 200%, with from 300 to at least 1000% being especially preferred. 
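
As a worked illustration of the percentage thresholds above, the following Python sketch computes the change in expression relative to a reference state and reports the highest threshold met. Measuring the change relative to the reference value is an assumption, since the text does not define the denominator; under this definition a decrease can never exceed 100%. The function names are hypothetical.

```python
# Thresholds named in the text, in percent, highest first.
TIERS = [300.0, 200.0, 150.0, 100.0, 50.0]


def percent_change(reference, sample):
    """Signed percent change of `sample` relative to `reference`.

    Using the reference state as the denominator is an assumption.
    """
    if reference <= 0:
        raise ValueError("reference expression must be positive")
    return (sample - reference) * 100.0 / reference


def highest_tier_met(reference, sample):
    """Return the largest threshold met by the magnitude of change, or None."""
    magnitude = abs(percent_change(reference, sample))
    for tier in TIERS:
        if magnitude >= tier:
            return tier
    return None
```

For example, a transcript measured at 100 units in normal tissue and 250 units in tumor tissue shows a 150% increase, which meets the 150% threshold but not the 200% one.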

Evaluation may be at the gene transcript, or the protein level. The amount of gene 
expression may be monitored using nucleic acid probes to the DNA or RNA equivalent of the 
gene transcript, and the quantification of gene expression levels, or, alternatively, the final 
gene product itself (protein) can be monitored, e.g., with antibodies to the prostate cancer 
protein and standard immunoassays (ELISAs, etc.) or other techniques, including mass 
spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to prostate 
cancer genes, i.e., those identified as being important in a prostate cancer or disease 
phenotype, can be evaluated in a prostate cancer diagnostic test. 

In a preferred embodiment, gene expression monitoring is performed simultaneously 
on a number of genes. Multiple protein expression monitoring can be performed as well. 
Similarly, these assays may be performed on an individual basis as well. 

In this embodiment, the prostate cancer nucleic acid probes are attached to biochips as 
outlined herein for the detection and quantification of prostate cancer sequences in a 
particular cell. The assays are further described below in the example. PCR techniques can 
be used to provide greater sensitivity. 

In a preferred embodiment nucleic acids encoding the prostate cancer protein are 
detected. Although DNA or RNA encoding the prostate cancer protein may be detected, of 
particular interest are methods wherein an mRNA encoding a prostate cancer protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is 
complementary to and hybridizes with the mRNA and includes, but is not limited to, 
oligonucleotides, cDNA, or RNA. Probes also should contain a detectable label, as defined 
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the 
sample. Following washing to remove the non-specifically bound probe, the label is 
detected. In another method detection of the mRNA is performed in situ (in situ 
hybridization or ISH). In this method permeabilized cells or tissue samples are contacted 
with a detectably labeled nucleic acid probe for sufficient time to allow the probe to hybridize 
with the target mRNA. Following washing to remove the non-specifically bound probe, the 
label is detected. For example, a digoxigenin-labeled riboprobe (RNA probe) that is
complementary to the mRNA encoding a prostate cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as 
described herein (secreted, transmembrane, or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells 
containing prostate cancer sequences are used in diagnostic assays. Such assays may evaluate
tissues, e.g., by immunohistochemistry, or body fluids, e.g., blood. Detection may be
direct, i.e., of cells, or indirect, e.g., of products from cells. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes and/or corresponding 
polypeptides. 

As described and defined herein, prostate cancer proteins, including intracellular, 
transmembrane, or secreted proteins, find use as prognostic or diagnostic markers of prostate 
cancer or other prostate conditions. Detection of these proteins in putative prostate cancer 
tissue allows for detection, diagnosis, or prognosis of prostate proliferative disorders 
(malignant and non-malignant) including benign prostate hyperplasia (BPH) and cancer, and 
prostatitis. Diagnosis may also assist in selecting a therapeutic strategy, e.g., based on 
expression profiles and/or comparison to archival samples. In one embodiment, antibodies 
are used to detect prostate cancer proteins, directly or indirectly. A preferred method 
separates proteins from a sample by electrophoresis on a gel (typically a denaturing and 
reducing protein gel, but may be another type of gel, including isoelectric focusing gels and 
the like). Following separation of proteins, the prostate cancer protein is detected, e.g., by 
immunoblotting with antibodies raised against the prostate cancer protein. Methods of 
immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the prostate cancer protein find use in in
situ imaging techniques, e.g., in histology and/or in immunohistochemistry (e.g., Asai (ed.
1993) Methods in Cell Biology: Antibodies in Cell Biology (vol. 37) Academic Press). In this
method cells are contacted with from one to many antibodies to the prostate cancer protein(s).
Following washing to remove non-specific antibody binding, the presence of the antibody or
antibodies is detected. In one embodiment the antibody is detected by incubating with a
secondary antibody that contains a detectable label. In another method the primary antibody
to the prostate cancer protein(s) contains a detectable label, e.g., an enzyme marker that can
act on a substrate. In another preferred embodiment each one of multiple primary antibodies
contains a distinct and detectable label. This method finds particular use in simultaneous
screening for a plurality of prostate cancer proteins. As will be appreciated by one of
ordinary skill in the art, many other histological imaging techniques are also provided by the
invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing prostate cancer 
from blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as 
samples to be probed or tested for the presence of prostate cancer proteins, which may be
diagnostic of prostate conditions beyond cancer, e.g., BPH. Antibodies can be used to detect
a prostate cancer protein by previously described immunoassay techniques including ELISA,
immunoblotting (western blotting), immunoprecipitation, BIACORE technology, and the
like. Conversely, the presence of antibodies may indicate an immune response against an 
endogenous prostate cancer protein. 

In a preferred embodiment, in situ hybridization of labeled prostate cancer nucleic
acid probes to tissue arrays is done. For example, arrays of tissue samples, including prostate
cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra)
is then performed. When comparing the fingerprints between an individual and a standard,
the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It
is further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic acids,
modified proteins, and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer
or other prostate disorders, in terms of useful aspects of clinical condition, pathology, or other
information which may be relevant to long term prognosis. Again, this may be done on either
a protein or gene level, with the use of genes being preferred. Single or multiple genes may
be useful in various combinations. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment members of the proteins, nucleic acids, and antibodies as 
described herein are used in drug screening assays. The prostate cancer proteins, antibodies,
nucleic acids, modified proteins, and cells containing prostate cancer sequences are used in
drug screening assays or by evaluating the effect of drug candidates on a "gene expression
profile" or expression profile of polypeptides. In a preferred embodiment, the expression
profiles are used, preferably in conjunction with high throughput screening techniques to
allow monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-88; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic acids, 
modified proteins, and cells containing the native or modified prostate cancer proteins are 
used in screening assays. That is, the present invention provides novel methods for screening 
for compositions which modulate the prostate cancer phenotype or an identified physiological
function of a prostate cancer protein. As above, this can be done on an individual gene level
or by evaluating the effect of drug candidates on a "gene expression profile". In a preferred 
embodiment, the expression profiles are used, preferably in conjunction with high throughput 
screening techniques to allow monitoring for expression profile genes after treatment with a 
candidate agent, see Zlokarnik, supra. 

Having identified the differentially expressed genes herein, a variety of assays may be
executed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene as up regulated in prostate cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
prostate cancer protein. "Modulation" thus includes both an increase and a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing prostate cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue compared to
normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in
prostate cancer tissue compared to normal tissue often provides a target value of a 10-fold
increase in expression to be induced by the test compound.

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.
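The fold-change reasoning above (a 4-fold increase in cancer tissue suggests a target of about a 4-fold decrease, and vice versa) can be illustrated with a short sketch. The function name, the dictionary keys, and the example expression values are hypothetical and serve only to illustrate the arithmetic; they are not part of the described methods.

```python
def target_modulation(cancer_level: float, normal_level: float) -> dict:
    """Describe the change observed in cancer tissue relative to normal tissue
    and the opposite change a test compound would need to induce."""
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    # Fold magnitude is always expressed as a number >= 1.
    fold = max(cancer_level, normal_level) / min(cancer_level, normal_level)
    if cancer_level >= normal_level:
        observed, desired = "increase", "decrease"
    else:
        observed, desired = "decrease", "increase"
    return {"observed": observed, "desired": desired, "fold": fold}


# Hypothetical expression values in arbitrary units.
up_regulated = target_modulation(cancer_level=400.0, normal_level=100.0)
down_regulated = target_modulation(cancer_level=10.0, normal_level=100.0)
```

Here the up-regulated gene yields a desired 4-fold decrease and the down-regulated gene a desired 10-fold increase, matching the two cases given in the text.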

In a preferred embodiment, gene expression or protein monitoring of a number of
entities, i.e., an expression profile, is performed simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of prostate cancer sequences in a
particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may
be used with dispensed primers in desired wells. A PCR reaction can then be performed and
analyzed for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide
sequence set out in Tables 1A-4. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other
binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence, 
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter 
expression profiles, or expression profile nucleic acids or proteins provided herein. In one 
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal or
non-malignant tissue fingerprint. In another embodiment, a modulator induces a prostate cancer
phenotype. Generally, a plurality of assay mixtures are run in parallel with different agent 
concentrations to obtain a differential response to the various concentrations. Typically, one 
of these concentrations serves as a negative control, i.e., at zero concentration or below the 
level of detection. 
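The parallel assay layout described above, with several agent concentrations and one zero-concentration negative control, can be sketched as follows. The function name, the choice of a serial dilution scheme, and the example values are assumptions made for illustration; the text does not prescribe how the concentrations are chosen.

```python
def concentration_series(top: float, dilution_factor: float, n_points: int) -> list:
    """Return ascending agent concentrations: a zero-concentration negative
    control followed by a serial dilution whose highest point is `top`."""
    if top <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top > 0, dilution_factor > 1, n_points >= 1")
    doses = [top / dilution_factor ** i for i in range(n_points)]
    # 0.0 is the negative control; the remaining entries are the test doses.
    return [0.0] + sorted(doses)


# Hypothetical example: four two-fold dilutions from 100 (arbitrary units).
series = concentration_series(top=100.0, dilution_factor=2, n_points=4)
```

Each entry of the returned list would correspond to one assay mixture run in parallel.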

Drug candidates encompass numerous chemical classes, though typically they are 
organic molecules, preferably small organic compounds having a molecular weight of more 
than 100 and less than about 2,500 daltons. Preferred small molecules are less than 2000, or 
less than 1500, or less than 1000, or less than 500 D. Candidate agents comprise functional 
groups necessary for structural interaction with proteins, particularly hydrogen bonding, and 
typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least 
two of the functional chemical groups. The candidate agents often comprise cyclical carbon 
or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or 
more of the above functional groups. Candidate agents are also found among biomolecules 
including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, 
structural analogs, or combinations thereof. Particularly preferred are peptides. 

In one aspect, a modulator will neutralize the effect of a prostate cancer protein. By
"neutralize" is meant that the activity of a protein, and its consequent effect on the cell,
is inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be 
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity. 
Conventionally, new chemical entities with useful properties are generated by identifying a 
chemical compound (called a "lead compound") with some desirable property or activity, 
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property 
and activity of those variant compounds. Often, high throughput screening (HTS) methods 
are employed for such an analysis. 

In one preferred embodiment, high throughput screening methods involve providing a 
library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as 
conventional "lead compounds" or can themselves be used as potential or actual therapeutics. 

A combinatorial chemical library is a collection of diverse chemical compounds 
generated by either chemical synthesis or biological synthesis by combining a number of 
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical 
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical 
building blocks called amino acids in most every possible way for a given compound length 
(i.e., the number of amino acids in a polypeptide compound). Millions of chemical 
compounds can be synthesized through such combinatorial mixing of chemical building 
blocks. Gallop, et al. (1994) J. Med. Chem. 37:1233-1251. 
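The scale of a linear combinatorial library follows directly from the number of building blocks and the compound length. The sketch below assumes the 20 standard amino acids as building blocks and counts or enumerates every sequence of a given length; the names `AMINO_ACIDS`, `library_size`, and `enumerate_library` are illustrative only.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_library(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


dipeptides = list(enumerate_library(2))
pentapeptide_count = library_size(5)
```

With 20 building blocks, a length of only five residues already gives 20^5 = 3,200,000 sequences, consistent with the statement that millions of compounds can be produced this way.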

Preparation and screening of combinatorial chemical libraries is well known to those 
of skill in the art. Such combinatorial chemical libraries include, but are not limited to, 
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res. 37:487-493,
Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT 
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as 
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA 
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc. 
114:6568-xxx), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann, et al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic
syntheses of small compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661-xxx),
oligocarbamates (Cho, et al. (1993) Science 261:1303-1305), and/or peptidyl
phosphonates (Campbell, et al. (1994) J. Org. Chem. 59:658-xxx) (see, generally, Gordon, et
al. (1994) J. Med. Chem. 37:1385-1401), nucleic acid libraries (see, e.g., Stratagene, Corp.),
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., 
Vaughn, et al. (1996) Nature Biotechnology 14:309-314, and PCT/US96/10287), 
carbohydrate libraries (see, e.g., Liang, et al. (1996) Science 274:1520-1522, and U.S. Patent 
No. 5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum 
(1993) C&EN. Jan 18, page 33; isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and 
metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 
5,519,134; morpholino compounds, U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent
No. 5,288,514; and the like). 

Devices for the preparation of combinatorial libraries are commercially available (see, 
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA). 

A number of well known robotic systems have also been developed for solution phase 
chemistries. These systems include automated workstations like the automated synthesis 
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. Many of the above devices are suitable for use with the present invention. The
nature and implementation of modifications to these devices (if any) so that they can operate
as discussed herein will be apparent to persons skilled in the relevant art. In addition,
numerous combinatorial libraries are themselves commercially available (see, e.g.,
ComGenex, Princeton, N.J.; Asinex, Moscow, RU; Tripos, Inc., St. Louis, MO; ChemStar,
Ltd, Moscow, RU; 3D Pharmaceuticals, Exton, PA; Martek Biosciences, Columbia, MD,
etc.). 

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other properties 
of particular nucleic acids or protein products are well known to those of skill in the art.
Binding assays and reporter gene assays are similarly well known. Thus, e.g., U.S.
Patent No. 5,559,410 discloses high throughput screening methods for proteins, U.S. Patent 
No. 5,585,639 discloses high throughput screening methods for nucleic acid binding (i.e., in 
arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high throughput methods of 
screening for ligand/antibody binding. 

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate 
for the assay. These configurable systems provide high throughput and rapid start up as well 
as a high degree of flexibility and customization. The manufacturers of such systems provide 
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene 
transcription, ligand binding, and the like. 

In one embodiment, modulators are proteins, often naturally occurring proteins or 
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or 
random or directed digests of proteinaceous cellular extracts, may be used. In this way 
libraries of proteins may be made for screening in the methods of the invention. Particularly 
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins, 
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes, or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30 
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to 
about 15 being particularly preferred. The peptides may be digests of naturally occurring 
proteins as is outlined above, random peptides, or "biased" random peptides. By 
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide 
consists of essentially random nucleotides and amino acids, respectively. Since generally 
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they 
may typically incorporate any nucleotide or amino acid at any position. The synthetic 
process can be designed to generate randomized proteins or nucleic acids, to allow the 
formation of all or most of the possible combinations over the length of the sequence, thus 
forming a library of randomized candidate bioactive proteinaceous agents. 

In one embodiment, the library is fully randomized, with no sequence preferences or 
constants at any position. In a preferred embodiment, the library is biased. That is, some 
positions within the sequence are either held constant, or are selected from a limited number 
of possibilities. For example, in a preferred embodiment, the nucleotides or amino acid 
residues are randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic 
residues, sterically biased (either small or large) residues, towards the creation of nucleic acid 
binding domains, the creation of cysteines for cross-linking, prolines for SH-3 domains,
serines, threonines, tyrosines, or histidines for phosphorylation sites, etc., or to purines, etc. 
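The distinction between a fully randomized and a biased library can be made concrete with a small sketch. Each position of a template is either held constant (one allowed residue) or drawn from a limited residue class. The `HYDROPHOBIC` class membership, the template, and the function name are hypothetical choices made only for illustration.

```python
from itertools import product

# Illustrative residue class; the exact membership is an assumption.
HYDROPHOBIC = "AVLIMFW"


def biased_library(template):
    """Yield every peptide allowed by the template. Each template entry is a
    string of residues permitted at that position; a one-character entry
    holds the position constant."""
    for combo in product(*template):
        yield "".join(combo)


# Constant Cys and Pro positions, one Ser/Thr position, and two positions
# randomized within the hydrophobic class.
template = ["C", HYDROPHOBIC, "P", "ST", HYDROPHOBIC]
peptides = list(biased_library(template))
```

This template yields 1 x 7 x 1 x 2 x 7 = 98 peptides, compared with 20^5 = 3,200,000 for a fully randomized library of the same length.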

Modulators of prostate cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be 
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. 
For example, digests of prokaryotic or eukaryotic genomes may be used as is outlined above 
for proteins. 

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After the candidate agent has been added and the cells allowed to incubate for some 
period of time, the sample containing a target sequence to be analyzed is added to the 
biochip. If required, the target sequence is prepared using known techniques. For example, 
the sample may be treated to lyse the cells, using known lysis buffers, electroporation, etc., 
with purification and/or amplification such as PCR performed as appropriate. For example, 
an in vitro transcription with labels covalently attached to the nucleotides is performed. 
Generally, the nucleic acids are labeled with biotin-FITC or PE, or with Cy3 or Cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a 
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the 
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis. 

As will be appreciated by those in the art, these assays can be direct hybridization 
assays or can comprise "sandwich assays", which include the use of multiple probes, as is 
generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117,
5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352, 5,594,118,
5,359,100, 5,124,246, and 5,681,697, each of which is hereby incorporated by reference. In
this embodiment, in general, the target nucleic acid is prepared as outlined above, and then
added to the biochip comprising a plurality of nucleic acid probes, under conditions that
allow the formation of a hybridization complex. 

A variety of hybridization conditions may be used in the present invention, including 
high, moderate, and low stringency conditions as outlined above. The assays are generally 
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally 
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding. 

The reactions outlined herein may be accomplished in a variety of ways. Components 
of the reaction may be added simultaneously, or sequentially, in different orders, with 
preferred embodiments outlined below. In addition, the reaction may include a variety of 
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc., 
which may be used to facilitate optimal hybridization and detection, and/or reduce non- 
specific or background interactions. Reagents that otherwise improve the efficiency of the 
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be 
used as appropriate, depending on the sample preparation methods and purity of the target. 

The assay data are analyzed to determine the expression levels, and changes in 
expression levels as between states, of individual genes, forming a gene expression profile. 
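One minimal way to turn assay data into a profile of expression changes between two states is to compute a log2 ratio per gene and keep the genes that change by at least some threshold. The sketch below assumes expression levels are supplied as dictionaries keyed by gene identifier; the function name, the gene names, the values, and the default two-fold threshold are all hypothetical and are not specified by the text.

```python
import math


def expression_profile(state_a: dict, state_b: dict, min_fold: float = 2.0) -> dict:
    """Map each gene measured in both states to log2(level_b / level_a),
    keeping only genes that change by at least `min_fold` in either direction."""
    threshold = math.log2(min_fold)
    profile = {}
    for gene in state_a.keys() & state_b.keys():
        a, b = state_a[gene], state_b[gene]
        if a <= 0 or b <= 0:
            continue  # a log ratio is undefined for non-positive levels
        log2_ratio = math.log2(b / a)
        if abs(log2_ratio) >= threshold:
            profile[gene] = log2_ratio
    return profile


# Hypothetical expression levels in arbitrary units.
normal = {"geneA": 100.0, "geneB": 50.0, "geneC": 80.0}
tumor = {"geneA": 400.0, "geneB": 5.0, "geneC": 88.0}
profile = expression_profile(normal, tumor)
```

In this example geneA (4-fold up) and geneB (10-fold down) enter the profile, while geneC (1.1-fold) falls below the assumed threshold.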

Screens are performed to identify modulators of the prostate cancer or related 
phenotype. In one embodiment, screening is performed to identify modulators that can 
induce or suppress a particular expression profile, thus preferably generating the associated 
phenotype. In another embodiment, e.g., for diagnostic applications, having identified 
differentially expressed genes important in a particular state, screens can be performed to 
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the 
expression product of a differentially expressed gene. Again, having identified the 
importance of a gene in a particular state, screens are performed to identify agents that bind 
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a candidate
agent. After identifying a modulator based upon its ability to suppress a prostate cancer 
expression pattern leading to a normal expression pattern, or to modulate a single prostate 
cancer gene expression profile so as to mimic the expression of the gene from normal tissue, 
a screen as described above can be performed to identify genes that are specifically 
modulated in response to the agent. Comparing expression profiles between normal tissue 
and agent treated prostate cancer tissue reveals genes that are not expressed in normal tissue
or prostate cancer tissue, but are expressed in agent treated tissue. These agent-specific 
sequences can be identified and used by methods described herein for prostate cancer genes 
or proteins. In particular these sequences and the proteins they encode find use in marking or 
identifying agent treated cells. In addition, antibodies can be raised against the agent induced 
proteins and used to target novel therapeutics to the treated prostate cancer tissue sample. 
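The comparison described above, identifying genes expressed in agent treated tissue but in neither normal nor untreated prostate cancer tissue, amounts to a set difference. The sketch assumes each tissue is summarized as a set of gene identifiers already called as expressed; how that call is made is outside the sketch, and all names are hypothetical.

```python
def agent_specific_genes(normal: set, cancer: set, treated: set) -> set:
    """Genes expressed in agent treated tissue but not in normal tissue
    and not in untreated prostate cancer tissue."""
    return set(treated) - set(normal) - set(cancer)


# Hypothetical sets of expressed gene identifiers.
normal_expressed = {"g1", "g2"}
cancer_expressed = {"g2", "g3"}
treated_expressed = {"g2", "g3", "g4"}
agent_specific = agent_specific_genes(normal_expressed, cancer_expressed, treated_expressed)
```

Here only "g4" is agent-specific, since it appears solely in the treated sample.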

Thus, in one embodiment, a test compound is administered to a population of prostate
cancer cells that have an associated prostate cancer expression profile. By "administration"
or "contacting" herein is meant that the candidate agent is added to the cells in such a manner
as to allow the agent to act upon the cell, whether by uptake and intracellular action, or by 
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous 
candidate agent (e.g., a peptide) may be put into a viral construct such as an adenoviral or 
retroviral construct, and added to the cell, such that expression of the peptide agent is 
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used. 

Once the test compound has been administered to the cells, the cells can be washed if 
desired and are allowed to incubate under preferably physiological conditions for some 
period of time. The cells are then harvested and a new gene expression profile is generated, 
as outlined herein. 

Thus, e.g., prostate cancer or non-malignant tissue may be screened for agents that 
modulate, e.g., induce or suppress the prostate cancer or related phenotype. A change in at 
least one gene, preferably many, of the expression profile indicates that the agent has an 
effect on prostate cancer activity. By defining such a signature for the prostate cancer 
phenotype, screens for new drugs that alter the phenotype can be devised. With this 
approach, the drug target need not be known and need not be represented in the original 
expression screening platform, nor does the level of transcript for the target protein need to 
change.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of either the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic
acids of the Tables 1A-4. Preferably, the prostate cancer modulatory protein is a fragment.
In a preferred embodiment, the prostate cancer amino acid sequence which is used to
determine sequence identity or similarity is encoded by a nucleic acid of Tables 1A-4. In
another embodiment, the sequences are naturally occurring allelic variants of a protein 
encoded by a nucleic acid of Tables 1A-4. In another embodiment, the sequences are 
sequence variants as further described herein. 

Preferably, the prostate cancer modulatory protein is a fragment approximately 14
to 24 amino acids long. More preferably the fragment is a soluble fragment. Preferably, the
fragment includes a non-transmembrane region. In a preferred embodiment, the fragment has
an N-terminal Cys to aid in solubility. In one embodiment, the C-terminus of the fragment is
kept as a free acid and the N-terminus is a free amine to aid in coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an immunogenic
agent as discussed herein. In one embodiment the prostate cancer protein is conjugated to
BSA.

Measurements of prostate cancer polypeptide activity, or of prostate cancer or the
prostate cancer phenotype can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, a
mammalian prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro. 
For example, a prostate cancer polypeptide is first contacted with a potential modulator and 
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the 
prostate cancer polypeptide levels are determined in vitro by measuring the level of protein or 
mRNA. The level of protein is measured using immunoassays such as western blotting, 
ELISA, and the like with an antibody that selectively binds to the prostate cancer polypeptide 
or a fragment thereof. For measurement of mRNA, amplification, e.g., using PCR, LCR, or 
hybridization assays, e.g., northern hybridization, RNAse protection, dot blotting, are 
preferred. The level of protein or mRNA is detected using directly or indirectly labeled 
detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or 
enzymatically labeled antibodies, and the like, as described herein. 

Alternatively, a reporter gene system can be devised using the prostate cancer protein 
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein, 
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity 
is measured according to standard techniques known to those of skill in the art. 
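
As an illustration of how reporter activity after treatment might be compared with untreated cells, the following minimal sketch computes a fold change from replicate readings. The function name, the use of replicate means, and the example values are assumptions made for illustration only and are not part of the methods described herein.

```python
from statistics import mean


def reporter_fold_change(treated, untreated):
    """Ratio of mean reporter activity in treated versus untreated cells.

    Both arguments are lists of replicate readings (e.g., luciferase light
    units). A value above 1 suggests increased reporter activity; a value
    below 1 suggests decreased activity.
    """
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated reporter activity must be positive")
    return mean(treated) / baseline


# Hypothetical readings, for illustration only.
fold = reporter_fold_change([50.0, 60.0, 40.0], [100.0, 90.0, 110.0])
```

A candidate modulator would then be judged by how far the fold change departs from 1; the cutoff is left to the practitioner.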

In a preferred embodiment, as outlined above, screens may be done on individual 
genes and gene products (proteins). That is, having identified a particular differentially 
expressed gene as important in a particular state, screening of modulators of the expression of 
the gene or the gene product itself can be done. The gene products of differentially expressed 
genes are sometimes referred to herein as "prostate cancer proteins." The prostate cancer
protein may be a fragment, or alternatively, be the full length protein corresponding to a 
fragment shown herein. 

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
the activity of the differentially expressed proteins. Moreover, once initial candidate
compounds are identified, variants can be further screened to better evaluate
structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate cancer
protein and a candidate compound, and determining the binding of the compound to the
prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g., for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate cancer
protein or the candidate agent is non-diffusably bound to an insoluble support having isolated
sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble supports may be
made of any composition to which the compositions can be bound, that is readily separated
from soluble material, and that is otherwise compatible with the overall method of screening.
The surface of such supports may be solid or porous and of a convenient shape. Examples of
suitable insoluble supports include microtiter plates, arrays, membranes, and beads. These
are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon or
nitrocellulose, Teflon™, etc. Microtiter plates and arrays are especially convenient because a
large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition should be compatible with
the reagents and overall methods of the invention, maintain the activity of the composition,
and be nondiffusable. Preferred methods of binding include the use of antibodies (which do
not sterically block either the ligand binding site or activation sequence when the protein is
bound to the support), direct binding to "sticky" or ionic supports, chemical crosslinking, the
synthesis of the protein or agent on the surface, etc. Following binding of the protein or
agent, excess unbound material is removed by washing. The sample receiving areas may
then be blocked through incubation with bovine serum albumin (BSA), casein, or other
innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support, and a 
test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide 
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for 
human cells. A wide variety of assays may be used for this purpose, including labeled in 
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for 
protein binding, functional assays (phosphorylation assays, etc.) and the like. 

The determination of the binding of the test modulating compound to the prostate 
cancer protein may be done in a number of ways. In a preferred embodiment, the compound 
is labeled, and binding determined directly, e.g., by attaching all or a portion of the prostate 
cancer protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), 
washing off excess reagent, and determining whether the label is present on the solid support. 
Various blocking and washing steps may be utilized as appropriate. 

In some embodiments, only one of the components is labeled, e.g., the proteins (or 
proteinaceous candidate compounds) can be labeled. Alternatively, more than one 
component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by competitive 
binding assay. The competitor is a binding moiety known to bind to the target molecule (i.e., 
a prostate cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under 
certain circumstances, there may be competitive binding between the compound and the 
binding moiety, with the binding moiety displacing the compound. In one embodiment, the 
test compound is labeled. Either the compound, or the competitor, or both, is added first to 
the protein for a time sufficient to allow binding, if present. Incubations may be performed at 
a temperature which facilitates optimal activity, typically between 4 and 40° C. Incubation 
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically 
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed 
away. The second component is then added, and the presence or absence of the labeled 
component is followed, to indicate binding. 

In a preferred embodiment, the competitor is added first, followed by the test 
compound. Displacement of the competitor is an indication that the test compound is binding 
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution 
indicates displacement by the agent. Alternatively, if the test compound is labeled, the 
presence of the label on the support indicates displacement. 

In an alternative embodiment, the test compound is added first, with incubation and 
washing, followed by the competitor. The absence of binding by the competitor may indicate 
that the test compound is bound to the prostate cancer protein with a higher affinity. Thus, if 
the test compound is labeled, the presence of the label on the support, coupled with a lack of 
competitor binding, may indicate that the test compound is capable of binding to the prostate 
cancer protein. 

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the prostate cancer proteins. In this
embodiment, the methods comprise combining a prostate cancer protein and a competitor in a 
first sample. A second sample comprises a test compound, a prostate cancer protein, and a 
competitor. The binding of the competitor is determined for both samples, and a change, or 
difference in binding between the two samples indicates the presence of an agent capable of 
binding to the prostate cancer protein and potentially modulating its activity. That is, if the 
binding of the competitor is different in the second sample relative to the first sample, the 
agent is capable of binding to the prostate cancer protein. 
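
The comparison between the two samples can be sketched as follows. This is a minimal illustration, assuming replicate binding signals are available for each sample; the function names and the 20 percent threshold are hypothetical choices and are not specified herein.

```python
from statistics import mean


def competitor_binding_change(first_sample, second_sample):
    """Percent change in competitor binding caused by the test compound.

    first_sample: replicate signals for protein + competitor.
    second_sample: replicate signals for protein + competitor + test compound.
    Negative values mean less competitor bound when the compound is present.
    """
    reference = mean(first_sample)
    if reference <= 0:
        raise ValueError("reference binding signal must be positive")
    return 100.0 * (mean(second_sample) - reference) / reference


def is_candidate_binder(first_sample, second_sample, threshold_percent=20.0):
    """Flag a compound when competitor binding differs by at least the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    change = competitor_binding_change(first_sample, second_sample)
    return abs(change) >= threshold_percent
```

In practice the threshold would be chosen from the variability of the control replicates rather than fixed in advance.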

Alternatively, differential screening is used to identify drug candidates that bind to the 
native prostate cancer protein, but cannot bind to modified prostate cancer proteins. The 
structure of the prostate cancer protein may be modeled, and used in rational drug design to 
synthesize agents that interact with that site. Drug candidates that affect the activity of a
prostate cancer protein are also identified by screening drugs for the ability to either enhance 
or reduce the activity of the protein. 

Positive controls and negative controls may be used in the assays. Preferably control 
and test samples are performed in at least triplicate to obtain statistically significant results. 
Incubation of samples is for a time sufficient for the binding of the agent to the protein. 
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled, agent is determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of 
bound compound. 
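
By way of a hedged illustration, replicate scintillation counts might be summarized as shown below. The subtraction of a negative-control background and the function name are assumptions made for this sketch; the text above requires only that controls be run and that samples be measured in at least triplicate.

```python
from statistics import mean, stdev


def summarize_counts(test_cpm, negative_control_cpm):
    """Summarize replicate scintillation counts (counts per minute).

    Returns the background-subtracted mean signal and the sample standard
    deviation of the test replicates. Background subtraction against the
    negative control is an illustrative assumption.
    """
    if len(test_cpm) < 3 or len(negative_control_cpm) < 3:
        raise ValueError("at least triplicate measurements are expected")
    background = mean(negative_control_cpm)
    return mean(test_cpm) - background, stdev(test_cpm)


# Hypothetical counts, for illustration only.
bound, spread = summarize_counts([1000, 1100, 900], [100, 120, 80])
```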

A variety of other reagents may be included in the screening assays. These include 
reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which may be used to 
facilitate optimal protein-protein binding and/or reduce non-specific or background 
interactions. Also reagents that otherwise improve the efficiency of the assay, such as 
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding. 

In a preferred embodiment, the invention provides methods for screening for a 
compound capable of modulating the activity of a prostate cancer protein. The methods 
comprise adding a test compound, as defined above, to a cell comprising prostate cancer 
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In
another example, the determinations are made at different stages of the cell cycle
process.

In this way, compounds that modulate prostate cancer agents are identified. 
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is provided. 
The method comprises administration of a prostate cancer inhibitor. In another embodiment,
a method of inhibiting prostate cancer or other prostate proliferative condition is provided.
The method comprises administration of a prostate cancer inhibitor. In a further 
embodiment, methods of treating cells or individuals with prostate cancer are provided. The 
method comprises administration of a prostate cancer inhibitor. 

In one embodiment, a prostate cancer inhibitor is an antibody as discussed above. In
another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to those of 
skill in the art, as described below.

Soft agar growth or colony formation in suspension

Normal cells require a solid substrate to attach and grow. When the cells are 
transformed, they lose this phenotype and grow detached from the substrate. For example, 
transformed cells can grow in stirred suspension culture or suspended in semi-solid media, 
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate a normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are 
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique 3d ed.
Wiley-Liss, herein incorporated by reference. See also, the methods section of Garkavtsev, et
al. (1996), supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they 
touch other cells. When the cells touch one another, they are contact inhibited and stop 
growing. When cells are transformed, however, the cells are not contact inhibited and 
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a 
higher saturation density than normal cells. This can be detected morphologically by the 
formation of a disoriented monolayer of cells or rounded cells in foci within the regular 
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994), 
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a 
normal phenotype and become contact inhibited and would grow to a lower density. 

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a prostate cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
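
The labeling index referred to above is a simple percentage. The sketch below shows the arithmetic; the function name and the input validation are illustrative assumptions.

```python
def labeling_index(labeled_cells, total_cells):
    """Percentage of counted cells that carry the label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must be between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells
```

For example, 30 labeled cells among 200 counted cells gives a labeling index of 15 percent.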

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal counterparts
(see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J. Exp. Med.
131:836-879; Freshney, supra). This is in part due to release of various growth factors by the
transformed cells. Growth factor or serum dependence of transformed host cells can be
compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
pp. 178-184 in Mihich (ed. 1985) Biological Responses in Cancer Plenum). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) Angiogenesis and Cancer, Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" pp. 178-184 in Mihich (ed. 1985) Biological Responses in Cancer Plenum;
and Freshney (1985) Anticancer Res. 5:111-130.

Invasiveness into Matrigel

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous
gene into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288-1292). Chimeric targeted mice can
be derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual CSH Press; and Robertson (ed. 1987) Teratocarcinomas and Embryonic Stem Cells:
A Practical Approach IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921-930), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263-272; Selby, et al. (1980) Br. J.
Cancer 41:52-61) can be used as a host. Transplantable tumor cells (typically about 10⁶ cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that show a statistically
significant reduction (using, e.g., Student's t test) are said to have inhibited growth.
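
The comparison with the control can be illustrated by computing the statistic for a two-sample Student's t test with pooled variance. This is a minimal sketch under stated assumptions: the function name and the example tumor volumes are hypothetical, and the resulting statistic would still need to be compared against a t distribution with the returned degrees of freedom to assess significance.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(control, treated):
    """Return (t, degrees_of_freedom) for a two-sample Student's t test.

    control and treated are lists of tumor measurements (e.g., volumes).
    A pooled variance is used, which assumes similar variance in both groups.
    A positive t means the treated tumors are smaller on average.
    """
    n1, n2 = len(control), len(treated)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two measurements")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    if std_err == 0:
        raise ValueError("no variability in the data; t is undefined")
    return (mean(control) - mean(treated)) / std_err, df


# Hypothetical tumor volumes, for illustration only.
t_value, dof = pooled_t_statistic([10.0, 12.0, 14.0], [4.0, 6.0, 8.0])
```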



Polynucleotide modulators of prostate cancer 
Antisense and RNAi Polynucleotides

In certain embodiments, the activity of a prostate cancer-associated protein is
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise naturally-occurring
nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur
containing species which are known for use in the art. Analogs are comprehended by this
invention so long as they function effectively to hybridize with the prostate cancer protein
mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the
antisense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for prostate cancer molecules. A preferred antisense molecule is for a
prostate cancer sequence in Tables 1A-4, or for a ligand or activator thereof. Antisense or
sense oligonucleotides, according to the present invention, comprise a fragment generally of
at least about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive
an antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given
protein is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659-2668; and van der
Krol, et al. (1988) BioTechniques 6:958-976.
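
Deriving an antisense oligonucleotide from a known target sequence amounts to taking the reverse complement of a chosen window. The following minimal sketch illustrates this for an RNA oligonucleotide; the function name, the example sequence, and the strict enforcement of the 14 to 30 nucleotide range are assumptions made for illustration, and no sequence from Tables 1A-4 is used.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna, start, length):
    """Return the antisense RNA oligonucleotide (5'->3') for an mRNA window.

    mrna: target sequence written 5'->3' using A, C, G, U.
    start: zero-based index of the first target nucleotide.
    length: oligonucleotide length, here restricted to 14 to 30 nucleotides.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    if not 14 <= length <= 30:
        raise ValueError("length is expected to be about 14 to 30 nucleotides")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("window runs past the end of the sequence")
    # Reverse the window, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical 18-nucleotide target, for illustration only.
oligo = antisense_rna("AUGGCCAAGUUCGGAUCC", 0, 14)
```

A DNA oligonucleotide would use T in place of U in the complement table.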

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.

Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit 
transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an RNA 
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have 
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes, 
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing ribozymes are well known to those of skill in the art. See,
e.g., WO 94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada,
et al. (1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126.

Polynucleotide modulators of prostate cancer may be introduced into a cell containing
the target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock-out and knock-in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating prostate disorders, e.g., cancer in
cells or organisms, are provided. In one embodiment, the methods comprise administering to
a patient, e.g., to a cell within the patient, an anti-prostate cancer antibody that reduces or
eliminates the biological activity of an endogenous prostate cancer protein. Alternatively, the
methods comprise administering to a cell or organism a recombinant nucleic acid encoding a
prostate cancer protein. This may be accomplished in many ways. In a preferred
embodiment, e.g., when the prostate cancer sequence is down-regulated in prostate cancer,
such state may be reversed by increasing the amount of prostate cancer gene product in the
cell. This can be accomplished, e.g., by overexpressing the endogenous prostate cancer gene
or administering a gene encoding the prostate cancer sequence, using known gene-therapy
techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the prostate cancer sequence is up-regulated in prostate cancer, the
activity of the endogenous prostate cancer gene is decreased, e.g., by the administration of a
prostate cancer antisense nucleic acid.

In one embodiment, the prostate cancer proteins of the present invention may be used
to generate polyclonal and monoclonal antibodies to prostate cancer proteins. Similarly, the
prostate cancer proteins can be coupled, using standard technology, to affinity
chromatography columns. These columns may then be used to purify prostate cancer
antibodies useful for production, diagnostic, or therapeutic purposes. In a preferred
embodiment, the antibodies are generated to epitopes unique to a prostate cancer protein; that
is, the antibodies show little or no cross-reactivity to other proteins. The prostate cancer
antibodies may be coupled to standard affinity chromatography columns and used to purify
prostate cancer proteins. The antibodies may also be used as blocking polypeptides, as
outlined above, since they will specifically bind to the prostate cancer protein.

Methods of identifying variant prostate cancer-associated sequences 

Without being bound by theory, expression of various prostate cancer sequences is
correlated with prostate cancer or other prostate disorders. Accordingly, disorders based on
mutant or variant prostate cancer genes may be determined. In one embodiment, the
invention provides methods for identifying cells containing variant prostate cancer genes,
e.g., determining all or part of the sequence of at least one endogenous prostate cancer gene
in a cell. This may be accomplished using many sequencing techniques. In a preferred
embodiment, the invention provides methods of identifying the prostate cancer genotype of
an individual, e.g., determining all or part of the sequence of at least one prostate cancer gene
of the individual. This is generally done in at least one tissue of the individual, and may
include the evaluation of a number of tissues or different samples of the same tissue. The
method may include comparing the sequence of the sequenced prostate cancer gene to a
known prostate cancer gene, e.g., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared to the 
sequence of a known prostate cancer gene to determine if differences exist. This can be done 
using many known homology programs, such as Bestfit, etc. In a preferred embodiment, the 
presence of a difference in the sequence between the prostate cancer gene of the patient and 
the known prostate cancer gene correlates with a disease state or a propensity for a disease 
state, as outlined herein. 
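
By way of illustration only, the comparison of a sequenced gene with a known gene can be sketched as follows. This minimal Python sketch uses the standard-library difflib module as a stand-in for a dedicated homology program such as Bestfit; the function name and the example sequences are hypothetical and are not taken from the Tables.

```python
from difflib import SequenceMatcher


def sequence_differences(reference, sample):
    """List the regions where `sample` differs from `reference`.

    Each entry is (kind, position_in_reference, reference_bases, sample_bases),
    where kind is 'replace', 'delete', or 'insert'.
    """
    # autojunk=False keeps every base in play, even for long sequences.
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    differences = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            differences.append((kind, i1, reference[i1:i2], sample[j1:j2]))
    return differences


# Hypothetical sequences: a known (e.g., wild-type) fragment and a patient fragment.
known_gene = "ATGGCCTTAGA"
patient_gene = "ATGGCGTTAGA"
for difference in sequence_differences(known_gene, patient_gene):
    print(difference)
```

An empty result indicates that no difference was found over the compared region. Whether a reported difference correlates with a disease state is a separate determination that this sketch does not address.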

In a preferred embodiment, the prostate cancer genes are used as probes to determine 
the number of copies of the prostate cancer gene in the genome. 

In another preferred embodiment, the prostate cancer genes are used as probes to 
determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities, such as translocations and the like, are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a prostate cancer protein or 
modulator thereof, is administered to a patient. By "therapeutically effective dose" herein is 
meant a dose that produces effects for which it is administered. The exact dose will depend 
on the purpose of the treatment, and will be ascertainable by one skilled in the art using 
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman (1993) Pharmaceutical Dosage Forms (vols. 1-3, Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981); Lloyd (1999) The Art, Science and
Technology of Pharmaceutical Compounding, Amer. Pharma. Assn.; and Pickar (1999)
Dosage Calculations, Thomson). Adjustments for prostate cancer degradation, systemic
versus localized delivery, and rate of new protease synthesis, as well as the age, body weight, 
general health, sex, diet, time of administration, drug interaction, and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human. The patient typically
will suffer from a prostate proliferative disorder, e.g., malignant or non-malignant, and may
include cancer or other related conditions or disorders.

The administration of the prostate cancer proteins and modulators thereof of the 
present invention can be done in a variety of ways as discussed above, including, but not 
limited to, orally, subcutaneously, intravenously, intranasally, transdermally, 
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray, or via catheter. 

The pharmaceutical compositions of the present invention comprise a prostate cancer 
protein in a form suitable for administration to a patient. In the preferred embodiment, the 
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the 
biological effectiveness of the free bases and that are not biologically or otherwise 
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid, 
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid, 
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid, and the like. 
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases 
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts, and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine, 
tripropylamine, and ethanolamine. 

The pharmaceutical compositions may also include one or more of the following: 
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose, 
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit dosage 
forms depending upon the method of administration. For example, unit dosage forms 
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules 
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies, 
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally, 
should be protected from digestion. This is typically accomplished either by complexing the 
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by 
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a 
protection barrier. Means of protecting agents from digestion are well known in the art. 

The compositions for administration will commonly comprise a prostate cancer 
protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous 
carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These 
solutions are typically sterile and generally free of undesirable matter. These compositions 
may be sterilized by conventional, well known sterilization techniques. The compositions 
may contain pharmaceutically acceptable auxiliary substances as required to approximate 
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents 
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, 
sodium lactate, and the like. The concentration of active agent in these formulations can vary 
widely, and will be selected primarily based on fluid volumes, viscosities, body weight, and 
the like in accordance with the particular mode of administration selected and the patient's 
needs (e.g., (1980) Remington's Pharmaceutical Science (15th ed.); and Hardman, et al. (eds.
2001) Goodman & Gilman: The Pharmacological Basis of Therapeutics, McGraw-Hill).

Thus, a typical pharmaceutical composition for intravenous administration would be 
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per 
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman: The Pharmacological Basis
of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be 
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially retard or arrest the disease and its complications.
An amount adequate to accomplish this is defined as a "therapeutically effective dose."
Amounts effective for this use will depend upon the severity of the disease and the general 
state of the patient's health. Single or multiple administrations of the compositions may be 
administered depending on the dosage and frequency as required and tolerated by the patient. 
The composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose." 
The particular dose required for a prophylactic treatment will depend upon the medical 
condition and history of the mammal, the particular cancer being prevented, as well as other 
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer, e.g., based partly on gene expression profiles. 

It will be appreciated that the present prostate cancer protein-modulating compounds 
can be administered alone or in combination with additional prostate cancer modulating
compounds or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1A-4, such as antisense polynucleotides,
silencing RNA, or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of prostate
cancer-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo
(cell or organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for 
introducing foreign nucleotide sequences into host cells may be used. These include the use 
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, 
plasma vectors, viral vectors, and many other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see,
e.g., Berger and Kimmel (1987) Guide to Molecular Cloning Techniques from Methods in
Enzymology (vol. 152) Academic Press; Ausubel, et al., (eds. supplemented through 1999)
Current Protocols Lippincott; and Sambrook, et al. (1989) Molecular Cloning: A Laboratory
Manual (2d ed., Vol. 1-3) CSH Press).

In a preferred embodiment, prostate cancer proteins and modulators are administered 
as therapeutic agents, and can be formulated as outlined above. Similarly, prostate cancer 
genes (including both the full-length sequence, partial sequences, or regulatory sequences of 
the prostate cancer coding regions) can be administered in a gene therapy application. These
prostate cancer genes can include antisense applications, either as gene therapy (i.e., for
incorporation into the genome) or as antisense compositions, as will be appreciated by those
in the art. 

Prostate cancer polypeptides and polynucleotides can also be administered as vaccine 
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341-
349), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al.
(1994) Vaccine 12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990)
Nature 344:873-875; Hu, et al. (1998) Clin Exp Immunol. 113:235-243), multiple antigen
peptide systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. USA 85:5409-5413;
Tam (1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379, in Kaufmann (ed. 1996) Concepts in Vaccine Development de
Gruyter; Chakrabarti, et al. (1986) Nature 320:535-537; Hu, et al. (1986) Nature 320:537-
540; Kieny, et al. (1986) AIDS Bio/Technology 4:790-xxx; Top, et al. (1971) J. Infect. Dis.
124:148-154; Chanda, et al. (1990) Virology 175:535-547), particles of viral or synthetic
origin (see, e.g., Kofler, et al. (1996) J. Immunol. Methods 192:25-35; Eldridge, et al. (1993)
Sem. Hematol. 30:16-24; Falo, et al. (1995) Nature Med. 7:649-653), adjuvants (Warren, et
al. (1986) Annu. Rev. Immunol. 4:369-388; Gupta, et al. (1993) Vaccine 11:293-306),
liposomes (Reddy, et al. (1992) J. Immunol. 148:1585-1589; Rock (1996) Immunol. Today
17:131-137), or naked or particle absorbed cDNA (Ulmer, et al. (1993) Science 259:1745-
1749; Robinson, et al. (1993) Vaccine 11:957-960; Shiver, et al., p. 423, in Kaufmann (ed. 
1996) Concepts in Vaccine Development de Gruyter; Cease and Berzofsky (1994) Annu. 
Rev. Immunol. 12:923-989; and Eldridge, et al. (1993) Sem. Hematol. 30:16-24). Toxin- 
targeted delivery technologies, also known as receptor mediated targeting, such as those of 
Avant Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance 
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or 
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available 
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, 
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline 
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or 
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated 
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides; 
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A; and Quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be 
used as adjuvants. 

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient. 
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465-1468 as 
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; 
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies 
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery 
(see, e.g., U.S. Patent No. 5,922,687). 

For therapeutic or prophylactic immunization purposes, the peptides of the invention 
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are 
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful 
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus 
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leuk. Biol. 68:793-806; Hipp, 
et al. (2000) In Vivo 14:571-85). 

Methods for the use of genes as DNA vaccines are well known, and include placing a 
prostate cancer gene or portion of a prostate cancer gene under the control of a regulatable
promoter or a tissue-specific promoter for expression in a prostate cancer patient. The
prostate cancer gene used for DNA vaccines can encode full-length prostate cancer proteins,
but more preferably encodes portions of the prostate cancer proteins including peptides 
derived from the prostate cancer protein. In one embodiment, a patient is immunized with a 
DNA vaccine comprising a plurality of nucleotide sequences derived from a prostate cancer
gene. For example, prostate cancer-associated genes or sequences encoding subfragments of a
prostate cancer protein are introduced into expression vectors and tested for their 
immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T cell 
responses. This procedure may provide for production of cytotoxic T lymphocyte responses 
against cells which present antigen, including intracellular epitopes. 

In a preferred embodiment, the DNA vaccines include a gene encoding an adjuvant
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase
the immunogenic response to the prostate cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, prostate cancer genes find use in generating animal
models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., wherein antisense RNA is directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary. 

It is also possible that the prostate cancer protein is overexpressed in prostate cancer. 
As such, transgenic animals can be generated that overexpress the prostate cancer protein. 
Depending on the desired expression level, promoters of various strengths can be employed
to express the transgene. Also, the number of copies of the integrated transgene can be
determined and compared for a determination of the expression level of the transgene. 
Animals generated by such methods find use as animal models of prostate cancer and are 
additionally useful in screening for modulators to treat prostate cancer. 

Kits for Use in Diagnostic and/or Prognostic Applications

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In the diagnostic and research applications such kits may 
include one of the following: assay reagents, buffers, prostate cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, silencing RNA,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (i.e., 
protocols) for the practice of the methods of this invention. While the instructional materials
typically comprise written or printed materials they are not limited to such. A medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to electronic storage media (e.g., 
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such 
media may include addresses to internet sites that provide such instructional materials. 

The present invention also provides for kits for screening for modulators of prostate
cancer-associated sequences. Such kits can be prepared from readily available materials and
reagents. For example, such kits can comprise one or more of the following materials: a
prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for
testing prostate cancer-associated activity. Optionally, the kit contains biologically active 
prostate cancer protein. A wide variety of kits and components can be prepared according to 
the present invention, depending upon the intended user of the kit and the particular needs of
the user. Diagnosis would typically involve evaluation of a plurality of genes or products.
The genes will be selected based on correlations with important parameters in disease which
may be identified in historical or outcome data.

EXAMPLES

Example 1: Gene Chip Analyses of Expression Profiles

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-
993). 

Example 2: Identification of androgen dependent/independent genes

To identify gene expression changes during the transition from androgen-dependent to
androgen-independent prostate cancer, oligonucleotide microarrays ("K" chips or Affymetrix
Eos Hu03) were interrogated with cRNAs derived from the human CWR22 prostate cancer
xenograft model propagated in nude mice (Pretlow, et al. (1993) J. Natl. Cancer Inst. 85:394-
398). The CWR22 xenograft is androgen-dependent when grown in male nude mice.
Androgen-independent sub-lines can be derived by first establishing androgen-dependent
tumors in male mice. The mice are then castrated to remove the primary source of growth
stimulus (androgen), resulting in tumor regression. Within 3-10 months molecular events 
prompt the tumors to relapse and start growing as androgen-independent tumors. See, e.g., 
Nagabhushan, et al. (1996) Cancer Res. 56:3042-3046; Amler, et al. (2000) Cancer Res. 
60:6134-6141; and Bubendorf, et al. (1999) J. Natl. Cancer Inst. 91:1758-1764. 

Using the CWR22 xenograft model, tumors were grown subcutaneously in male nude
mice. Tumors were harvested at different times after castration. The time points post-castration
included (in days): 0, 1, 3, 4, 5, 10, 30, 40, 50, 51, 52, 59, 60, 61, 70, 79, 80, 82,
120, and 125. Analyses also included established androgen-independent xenografts. 
Castration resulted in tumor regression. At day 120 and thereafter, the tumors relapsed and
started growing in the absence of androgen.

cRNAs were generated by in vitro transcription assays (IVTs) from the different 
samples and were hybridized to the oligonucleotide microarrays (Affymetrix Eos Hu03). 
Hybridization was measured by the average fluorescence intensity (AI), which is directly
proportional to the expression level of the gene. 

Two types of analyses were applied to the results:

Analysis A:
The samples were divided into different time groups which included the following
time points post castration (in days): 1-5, 10, 30-40, 50-82, 120-125. To identify changes in 
gene expression, the following calculations were made: 

1. The median (or mean, in case there were only 2 samples in a group) was calculated
for each group.

2. The medians (or means) for each group were compared to one another.

3. Genes were selected that exhibited a minimum 2 fold difference in the median (or
mean) between any of the groups.

4. The change in gene expression over time was analyzed for each selected gene to look
for specific pattern changes.
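
The calculations of steps 1 to 3 may be sketched, purely by way of example, as follows. The sketch is written in Python; the function names, the intensity floor of 1.0 AI unit used to avoid division by zero, and the example data are assumptions introduced for illustration and do not form part of the analysis as performed. Samples whose time point falls outside the listed time groups are ignored in this sketch.

```python
from statistics import mean, median

# Time groups post castration (in days), as listed for Analysis A.
TIME_GROUPS = {
    "1-5": (1, 5),
    "10": (10, 10),
    "30-40": (30, 40),
    "50-82": (50, 82),
    "120-125": (120, 125),
}


def group_for_day(day):
    """Return the name of the time group containing `day`, or None."""
    for name, (first, last) in TIME_GROUPS.items():
        if first <= day <= last:
            return name
    return None


def group_summaries(days, intensities):
    """Step 1: median per group (mean when a group has only 2 samples)."""
    grouped = {}
    for day, value in zip(days, intensities):
        name = group_for_day(day)
        if name is not None:
            grouped.setdefault(name, []).append(value)
    return {
        name: (mean(values) if len(values) == 2 else median(values))
        for name, values in grouped.items()
    }


def max_fold_difference(summaries, floor=1.0):
    """Step 2: largest ratio between any two group summaries.

    `floor` is an assumed lower bound that avoids division by zero.
    """
    values = [max(value, floor) for value in summaries.values()]
    return max(values) / min(values)


def select_genes(days, expression, min_fold=2.0):
    """Step 3: keep genes with at least `min_fold` difference between groups."""
    return [
        gene
        for gene, intensities in expression.items()
        if max_fold_difference(group_summaries(days, intensities)) >= min_fold
    ]


# Hypothetical data: one intensity per sample, in the same order as `days`.
days = [1, 3, 10, 30, 40, 50, 60, 120, 125]
expression = {
    "geneA": [100, 120, 50, 40, 45, 42, 41, 110, 130],
    "geneB": [100, 105, 98, 101, 99, 100, 102, 97, 103],
}
print(select_genes(days, expression))
```

Step 4, the review of each selected gene for a specific pattern over time, is not automated in this sketch.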

Only genes with an interesting expression pattern during the androgen-ablation time 
course were selected as potential new therapeutic targets and/or diagnostic markers. Among 
the 70,000 gene clusters present on Hu01 and Hu02, we identified 820 gene clusters with the
desired expression patterns. These expression patterns can be broadly defined into the
following categories:

1. Genes that are expressed early in the time course, then drop off in expression, and 
then express again with emergence of androgen-independence (hi-lo-hi pattern in Table 1A). 

2. Genes that are expressed early in the time course, then drop off in expression, and do 
not express again with emergence of androgen-independence (hi-lo-lo pattern in Table 1A). 

3. Genes that are not expressed early in the time course, but express only with
emergence of androgen-independence (lo-lo-hi pattern in Table 1A). 

4. Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi 
pattern in Table 1A). 

5. Genes that are not expressed early in the time course, but then express as androgen is
withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo pattern in 
Table 1A). 
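
As a non-limiting illustration, the five categories above can be expressed as a simple lookup from a three-phase expression pattern to a group number. In the following Python sketch, the choice of three phases (early, after androgen withdrawal, and at emergence of androgen-independence), the intensity threshold separating "hi" from "lo", and the function names are assumptions made for illustration; the manner in which the time groups were mapped to phases is not reproduced here.

```python
# Pattern labels as used in Table 1A, mapped to the category numbers above.
PATTERN_GROUPS = {
    "hi-lo-hi": 1,
    "hi-lo-lo": 2,
    "lo-lo-hi": 3,
    "lo-hi-hi": 4,
    "lo-hi-lo": 5,
}


def expression_pattern(phase_levels, threshold):
    """Label each phase 'hi' or 'lo' relative to an assumed threshold."""
    return "-".join(
        "hi" if level >= threshold else "lo" for level in phase_levels
    )


def pattern_group(phase_levels, threshold):
    """Return the category number for the pattern, or None if not listed."""
    return PATTERN_GROUPS.get(expression_pattern(phase_levels, threshold))


# Hypothetical intensities for the three phases, with an assumed threshold of 200.
print(expression_pattern([500, 80, 450], 200))
print(pattern_group([500, 80, 450], 200))
```

Patterns that are not among the five listed categories, such as a gene that is low in every phase, return no group in this sketch.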

Group 1 is characterized by cell-cycle regulating genes, such as those encoding 
cyclin B1, p21/WAF1, CDC18-homolog, cyclin A2, cyclin D1, and possible growth factors
such as hAG2 (anterior gradient 2 homolog) among others. This indicates that interruption of
growth factor and/or cell cycle pathways prevents the emergence of androgen-independent
disease, making Group 1 genes good targets for treating advanced prostate cancer.

Group 2 represents genes that are androgen-dependent, and do not re-express due to
the lack of androgen signal in the androgen-independent phenotype. This group includes
genes encoding proteins such as Fibronectin 1, which has been previously shown to be down-
regulated with androgen-withdrawal (Amler, et al. (2000) Cancer Res. 60:6134-6141).

Group 3 represents genes that are up-regulated by signals that induce the androgen-
independent phenotype. This group includes genes encoding stanniocalcin 2, c-fos proto-
oncogene product, vascular endothelial growth factor, the cell surface protein transmembrane
4 superfamily member 1 and adrenomedullin among others. Adrenomedullin has recently
been shown to act as an autocrine growth factor for the androgen-independent prostate cancer
cell line DU145 (Rocchi, et al. (2001) Cancer Res. 61:1196-1206), indicating that its up-
regulation is critical for supporting an androgen-independent phenotype. Blocking
adrenomedullin function, and/or other genes in this group, prevents the growth of androgen-
independent tumor cells.

Group 4 represents genes that are androgen-repressed and are only expressed in the
absence of androgen. This group includes genes encoding the protein tyrosine phosphatase
interacting protein liprin-alpha 2, the CD24 antigen, and the catalytic subunit for
phosphatidylinositol 4-kinase amongst others. Patients that are treated for advanced prostate
cancer by hormone-ablation may have in their bodies cells that have survived hormone-
ablation and are likely to up-regulate genes that belong to Group 4. Therefore, Group 4 gene
products are particularly good therapeutic targets for treating patients undergoing hormone-
ablation therapy.

Group 5 represents genes that are involved in regulating signals that induce an 
androgen-independent phenotype. This group includes genes encoding Rab2 (a Ras-like G
protein), the Son of Sevenless homolog (a GTP/GDP exchange factor involved in activating
Ras-like proteins), and the p85 regulatory subunit for phosphoinositide-3-kinase (PI3-kinase).
The PI3-kinase pathway has been implicated in providing a survival signal to the prostate
cancer cell line LNCaP (Lin, et al. (1999) Cancer Res. 59:2891-2897). This indicates that
ras-like signals and signals dependent on PI3-kinase are involved in inducing the androgen-
independent phenotype. For that reason, Group 5 gene products are particularly good
therapeutic targets for treating patients undergoing hormone-ablation therapy.

Analysis B:
For the second analysis, the samples were divided into 4 time groups which included
the following time points post castration (in days): 0-1, 3-5, 10-82, >120. To identify 
changes in gene expression, the following analysis was performed: 

1. Genes were selected that exhibited a minimum of 100 AI units at the 90th percentile
expression level of samples.

2. The group mean expression levels for each gene were calculated. The genes were further
sub-selected to exhibit a minimum 3 fold difference between the group means.

3. An analysis of variance was then performed on selected genes. From the original 59,680
gene clusters present on the Hu03 gene chip, only about 1165 genes with a P value of < 0.01
were identified that also exhibited the above mentioned parameters.

4. A method was then employed for calculating the positive false discovery rate (pFDR), i.e., 
an estimate of the proportion of false-positives present in a set of findings (Storey and 
Tibshirani (2001) Technical Report, Department of Statistics, Stanford University, CA).
This technique was developed explicitly for use with microarray data. The procedure
involves randomly assigning the membership status of each sample to a group and re-
performing the analysis of variance. In each simulation, the number of group members (6 for
Group 1, 9 for Group 2, 15 for Group 3, and 4 for Group 4) remained constant, but these
designations were shuffled and assigned to each sample at random. The permutation was
performed 1000 times, and for each simulation, the number of findings at P < 0.01 was noted.
The number of false positives under null conditions was then divided by the number of
actual findings (n=1165 genes) to obtain an estimate of the proportion of false positive
findings. After the application of a correction factor, the final estimate for the pFDR was
about 1%. Thus, one can expect that approximately 12 of the 1165 findings are false
positives. 

5. The approximately 1165 genes were clustered by expression pattern to identify specific
pattern changes. Only genes with an interesting expression pattern during the
androgen-ablation time course were selected as potential new therapeutic targets and/or diagnostic
markers. These expression patterns can be broadly defined into the following categories:

1. Genes that are expressed early in the time course of androgen withdrawal, then drop off in
expression, and then express again with emergence of androgen-independence (hi-lo-lo-hi
pattern in Table 2A).
2. Genes that are expressed early in the time course, then drop off in expression immediately
after androgen-withdrawal, and do not express again with emergence of androgen- 
independence (hi-lo-lo-lo pattern in Table 2A). 

3. Genes that are expressed early in the time course, then drop off in expression after several 
days of androgen withdrawal, and do not express again with emergence of androgen- 
independence (hi-hi-lo-lo pattern in Table 2A). 

4. Genes that are not expressed early in the time course, but express only with emergence of 
androgen-independence (lo-lo-lo-hi pattern in Table 2A). 

5. Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and continue to express with emergence of androgen-independence (lo-lo-hi-hi 
pattern in Table 2A). 

6. Genes that are not expressed early in the time course, but then express as androgen is 
withdrawn and drop off again with emergence of androgen-independence (lo-lo-hi-lo pattern 
in Table 2A). 
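
The permutation estimate described in step 4 can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the use of a fixed F-statistic cutoff in place of the P < 0.01 threshold, and the synthetic data in the demonstration are all assumptions, and the correction factor mentioned in step 4 is omitted.

```python
import random


def f_statistic(values, labels, n_groups):
    """One-way ANOVA F statistic for a single gene (every group must be non-empty)."""
    groups = [[] for _ in range(n_groups)]
    for value, label in zip(values, labels):
        groups[label].append(value)
    n = len(values)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0.0:
        return float("inf") if ss_between > 0.0 else 0.0
    return (ss_between / (n_groups - 1)) / (ss_within / (n - n_groups))


def count_findings(expr, labels, n_groups, f_crit):
    """Number of genes whose F statistic exceeds the cutoff."""
    return sum(1 for gene in expr if f_statistic(gene, labels, n_groups) > f_crit)


def permutation_fdr(expr, labels, n_groups, f_crit, n_perm, rng):
    """Uncorrected estimate: mean count under shuffled labels / observed count."""
    observed = count_findings(expr, labels, n_groups, f_crit)
    shuffled = list(labels)
    null_total = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # group sizes stay constant; only membership changes
        null_total += count_findings(expr, shuffled, n_groups, f_crit)
    mean_null = null_total / n_perm
    fdr = mean_null / observed if observed else 0.0
    return observed, mean_null, fdr


if __name__ == "__main__":
    rng = random.Random(0)
    # Group sizes from step 4: 6, 9, 15, and 4 samples.
    labels = [0] * 6 + [1] * 9 + [2] * 15 + [3] * 4
    # Synthetic expression values (hypothetical data, for illustration only).
    expr = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(200)]
    for gene in expr[:20]:
        for i in range(6):
            gene[i] += 5.0  # planted difference in the first group
    # Assumed cutoff: approximate tabulated F critical value for
    # alpha = 0.01 with (3, 30) degrees of freedom.
    observed, mean_null, fdr = permutation_fdr(expr, labels, 4, 4.51, 50, rng)
    print(observed, mean_null, fdr)
```

With the figures reported in step 4, a pFDR of about 1% applied to 1165 findings gives 0.01 x 1165 = 11.65, i.e., approximately 12 expected false positives.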

Group 1 is characterized by cell-cycle regulating genes and cell growth promoting 
genes, such as those encoding cyclin B1 and CDC45 among others, growth factors/hormones
such as hAG2 (anterior gradient 2 homolog), adrenomedullin, and stanniocalcin 2 among 
others, and growth factor receptors, such as the bone morphogenic protein receptor type IB 
(BMP-RIB) and the endothelial differentiation lysophosphatidic acid G-protein-coupled 
receptor 7 among others. Adrenomedullin has recently been shown to act as an autocrine 
growth factor for the androgen-independent prostate cancer cell line DU145 (Rocchi, et al. 
(2001) Cancer Res. 61:1196-1206), indicating that its up-regulation is critical for supporting
an androgen-independent phenotype. This indicates that interruption of growth factor and/or 
cell cycle pathways prevents the emergence of androgen-independent disease, making group 
1 genes good targets for treating both localized and advanced prostate cancer and related 
conditions. 

Group 2 represents genes that are androgen-dependent, and do not re-express due to 
the lack of androgen signal in the androgen-independent phenotype. This group includes 
genes encoding proteins such as the endothelial protein C receptor (EPCR) and the potassium 
intermediate/small conductance calcium-activated channel (subfamily N, member 2). These
genes represent targets for treating androgen-dependent prostate cancer and related 
conditions. 
Group 3 also represents genes that are androgen-dependent, and do not re-express due
to the lack of androgen signal in the androgen-independent phenotype. This group includes
genes encoding proteins such as Fibronectin 1, which has been previously shown to be
down-regulated with androgen-withdrawal (Amler, et al. (2000) Cancer Res. 60:6134-6141), and
genes encoding signaling proteins such as Rho GTPase activating protein 1. These genes
represent targets for treating androgen-dependent prostate cancer and related conditions.

Group 4 represents genes that are up-regulated by signals that induce and maintain the
androgen-independent phenotype. This group includes genes encoding potential growth
promoting proteins such as chemokine-like factor (Unigene ID Hs.15159), colon
cancer-associated protein Mic1, and the mitogen-activated protein kinase-activated protein kinase 2.
Blocking function of these proteins, and/or other genes in this group, prevents the growth of
androgen-independent tumor cells and related conditions.

Group 5 represents genes that are androgen-repressed and are only expressed in the
absence of androgen or that are induced by the absence of androgen. This group includes
genes encoding transcriptional regulators such as the androgen receptor, the DNA activated
protein kinase (catalytic subunit), and nuclear factor related to kappa B binding protein
(NFRKB), among others. Patients that are treated for advanced prostate cancer by
hormone-ablation may have in their bodies cells that have survived hormone-ablation and are likely to
up-regulate genes that belong to Group 5. Therefore, Group 5 gene products are particularly
good therapeutic targets for treating patients undergoing hormone-ablation therapy.

Group 6 represents genes that are involved in regulating signals that are induced
during androgen withdrawal and that induce an androgen-independent phenotype. This group
includes genes encoding signaling molecules such as phosphoinositide-3-kinase (class 2,
alpha polypeptide), signal transducer and activator of transcription 2 (STAT2), phospholipase
A2 (group IIA), and the protein tyrosine phosphatase interacting protein liprin-alpha 2; cell
surface receptors such as gamma-aminobutyric acid (GABA) A receptor epsilon subunit and G
protein-coupled receptor 48; and immune function proteins such as the major
histocompatibility complex class II DR alpha. The PI3-kinase pathway has been implicated
in providing a survival signal to the prostate cancer cell line LNCaP (Lin, et al. (1999) Cancer
Res. 59:2891-2897). This indicates that ras-like signals and signals dependent on PI3-kinase
are involved in inducing the androgen-independent phenotype. For that reason, Group 6 gene
[...] ablation therapy.
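
The hi/lo pattern labels that define Groups 1 through 6 can be illustrated with a short sketch. The rule used here for calling a stage "hi" (a value above the midpoint of the gene's own range), the stage names, and the function names are assumptions made for illustration only; the text does not state how "hi" and "lo" were assigned.

```python
# Four points of the androgen-ablation time course (stage names are hypothetical).
STAGES = (
    "androgen-dependent",
    "early withdrawal",
    "late withdrawal",
    "androgen-independent",
)

# Pattern-to-group mapping taken from the six categories listed for Table 2A.
PATTERN_TO_GROUP = {
    "hi-lo-lo-hi": 1,
    "hi-lo-lo-lo": 2,
    "hi-hi-lo-lo": 3,
    "lo-lo-lo-hi": 4,
    "lo-lo-hi-hi": 5,
    "lo-lo-hi-lo": 6,
}


def pattern_label(stage_means):
    """Label each stage 'hi' or 'lo' relative to the midpoint of the gene's range."""
    if len(stage_means) != len(STAGES):
        raise ValueError("expected one mean expression value per stage")
    midpoint = (min(stage_means) + max(stage_means)) / 2.0
    return "-".join("hi" if m > midpoint else "lo" for m in stage_means)


def assign_group(stage_means):
    """Return the group number (1 to 6), or None if the pattern is not one of the six."""
    return PATTERN_TO_GROUP.get(pattern_label(stage_means))
```

For example, a gene with stage means of 10, 1, 1, and 9 would be labeled hi-lo-lo-hi and placed in Group 1 under these assumptions, while a gene with a flat profile would match none of the six categories.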
TABLE 1A provides Accession numbers for genes, including expressed sequence tags (incorporated in their entirety here and throughout the application where Accession
numbers are provided). Genes with an interesting expression pattern during the androgen-ablation time course were selected as potential new therapeutic targets and/or
diagnostic markers. 820 gene clusters were identified with desired expression patterns. These expression patterns can be broadly defined into the following categories:

1. Genes that are expressed early in the time course, then drop off in expression, and then express again with emergence of androgen-independence (hi-lo-hi pattern).

2. Genes that are expressed early in the time course, then drop off in expression, and do not express again with emergence of androgen-independence (hi-lo-lo pattern).

3. Genes that are not expressed early in the time course, but express only with emergence of androgen-independence (lo-lo-hi pattern).

4. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and continue to express with emergence of androgen-independence (lo-hi-hi
pattern).

5. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and drop off again with emergence of androgen-independence (lo-hi-lo
pattern).

Table 1B lists accession numbers for primekeys lacking a UnigeneID in Table 1A. For each probeset is listed a gene cluster number from which oligonucleotides were designed.
Gene clusters were compiled using sequences derived from Genbank ESTs [...]
Alignment Tools (DoubleTwist, Oakland California). Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.






UnigeneID Unigene Title

Hs.161002 absent in melant
Hs.10247 
Hs.10247 



BE241880 
U77949 
Y00272 
NM_001809



Hs.99915 
"Hs.91011 
■Hs.91011 
Hs.1578
Hs.75692 
Hs.98658 
'Hs.10029 
Hs.69563 
Hs.184572
"Hs.1594 



X54942 

L29073 

AA044840 

BE262998 

BE280074 

AK001404 

AU077231 

AA284166 

AW247430 

A1831962 



Hs.9329 

Hs.83758 

Hs.1139

■Hs.251871 

Hs.85137 

Hs.23960 

"Hs.194698 

■Hs.82932 

Hs.84113 

Hs.84152 



androgen receptor (dihydrotestosterone r 
anterior gradient 2 (Xenopus laevis) hom
anterior gradient 2 (Xenopus laevis) hom
baculoviral IAP repeat-containing 5 (sur 



cathepsin C 

CDC6 (cell division cycle 6, S. cerevisi 

cell division cycle 2, G1 to S and G2 to 

centromere protein A (17kD) 

CH.17JS gi|5867224 

CH.21 hs gi|6117842 

CH22_DA59H18.GENSCAN.72-13 

CH22_EM:AC000097.GENSCAN.109-2 

CH22_EMAC000097.GENSCAN.B7-4 

CH22_EMAC000097.GENSCAN.67-3 

CH22_FGENES.173 1 

CH22_FGENES.173_2 

CH22_FGENES.275_1 

CH22_FGENES.275_3 

CH22_FGENES.279_2 

CH22_FGENES.280_2 

CH22_FGENES.3_2 



CH22_FGENES.411_15 



CH22_FGENES.452_14 



CH22_FGENES.452_20 
CH22_FGENES.452_21 
CH22_FGENES.465_20 



CH22_FGENES.604_2 
CH22_FGENES.604_4 
CH22_FGENES.83_11
CH22_FGENES.83_13
CH22_FGENES.83_15
CH22_FGENES.83_16
CH22_FGENES.83_17
chromosome 20 open reading frame 1 



"Hs.181028 
■Hs.75752 
Hs.278544 
Hs.180015



cystathionine-beta-synthase

Inn ich | in 1 ( i tins 
cytochrome b-5 
cytochrome co 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
100656 BE250162 "Hs.83765

133799 W24087 Hs.76285

129113 BE543205 "Hs.288771

332732 AF191019 Hs.8361

108846 AL117452 "Hs.44155

133903 X63692 "Hs.77462

320099 AW411307 Hs.114311

321960 M723883 Hs.302445 

324988 AK001379 "Hs.121028 

303274 AK001468 Hs.62180 

301804 AK001468 Hs.62180 

300551 AW408800 Hs.104859 

304541 M482561 Hs.169476 

304521 AA464716 

129075 BE250162 "Hs.83765 

111003 N52980 Hs.83765 

115536 AK001468 Hs.62180 

108857 AK001468 Hs.62180 

332397 AB027249 Hs.104741 

330714 AA263143 Hs.24596

104636 R82252 Hs.106106 

104986 AW088826 Hs.22971

105076 A1598252 Hs.37810 

105312 BE613348 "Hs.23348 
Hs.11355 
Hs.236556 
"Hs.274422 
Hs.18349 

Hs.24641
'Hs.293380
Hs.133260
"Hs.122579
Hs.283099 



U46258 
109220 AW958181 
113158 AA328102 



BE545072 
M662240 
AK001376
115522 BE614337 
BE093589 
AK001330 
116130 AW183533 
116448 BE268321 



131148 AW953575 
BE514605 
131937 A1907735 
131965 W79283 
M235448 
AW836130 
300942 AW301344 
300953 M542845 
302656 BE090580 
311928 T62216 
313637 AK000742 
313832 AW271106 
316465 AW574774 
317202 M894880 
320771 R74441 
321636 AI820961 
330867 AW978991 
331442 H77381 
106654 AW075485 
106590 AI350260 
128460 T16206 
114394 T34462 

AW069807 
AW248434 
A1878857 



108886 
129241 
10497B 



332577 
116732 
106774 



fatty acid desaturase 2 

dihydrofolate reductase 

DKFZP554B167 protein 

DKFZP586A0522 protein 

hypothetical protein, estradiol-induced

DKFZP53SG1 51 7 protein 

DNA (cytosine-5-)-methyltransferase 1 

CDC45 (cell division cycle 45, S.cerevis 

hypothetical protein MGC10334 

hypothetical protein FLJ10549 

anillin (Drosophila Scraps homolog), act 

anillin (Drosophila Scraps homolog), act 

hypothetical protein DKFZp762E1312 

glyceraldehyde-3-phosphate dehydrogenase 

gb:zx82c11.s1 Soares ovary tumor NbHOT H



Hs.59 
Hs.47378 
Hs.38178 
Hs.48855 
Hs.38178 
Hs.208912 
Hs.15 ' 



BE327311 Hs.47166 
M687322 Hs.192843 
121503 AA412049 Hs.290347 
121748 BE536911 Hs.234545 
122860 M464414 
123477 AF217515 Hs.283532 
130338 A1375726 U.279918 
Hs.183109 
Hs.303125
"Hs.289092 
Hs.21446 
Hs.35962 
Hs.46677 
Hs.75277 
Hs.122908
Hs.294088 
Hs.70704 
Hs.270840 
Hs.126774
Hs.133294



Hs.221197 
Hs.159420
Hs.286049
Hs.301539 
Hs.237164 



Hs.31097 
Hs.27769 
Hs.165909 



dihydrofolate reductase 

anillin (Drosophila Scraps homolog), act 

anillin (Drosophila Scraps homolog), act 

PDZ-binding kinase; T-cell originated pr 

RAD51-interacting protein

Homo sapiens cAMP-dependent protein kina 



S-phase kinase-associated protein 2 (p45 



hypothetical protein STRA1T1 1499 



cytoskeleton associated protein 2 
ESTs 

hypothetical protein C L 1 20 ;54 
hypothetical protein FLJ10461 
AF15q1 4 protein 
hypothetical protein FLJ10514 
ESTs, Moderately similar to T50635 hypof 
hypothetical protein FLJ23468 
hypothetical protein FLJ10468 
hypothetical protein FLJ23468 
hypothetical protein MGC861 



Homo sapiens NUF2R mRNA, complete cds 
gb:zx78g01.s1 Soares ovary tumor NbHOT H
uncharacterized bone marrow protein BM03 
hypothetical protein 



53-ill d i 

Homo sapiens c 



I protein PIGPC1 
pienscD LJ2238C cnoh 
Homo sapiens mRNA for KIAA1716 protein,
ESTs 

PRO2000 protein 

hypothetical protein FLJ13910 

Homo sapiens, clone 1MAGE:30<8353, mRNA, 



AI216748 Hs.1 4587 



hypothetical protein 

hematological and neurological expressed 
ESTs, Weakly similar to CGHU7L collagen 
tcr-itin, light polypeptide 
hypothetical protein FLJ21478
ESTs A' c'ly similar to MCATJIJMANMITOC 
ESTs 



hi-lo-hi
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi
315618 A12B7341



103195 AA351647 



BE244377 

AF1 74600 

X02761 

AA452181 

U71321 

AC004770 

BE540274 

L11144 

AA317089 

L18861 

X14850 

BE561617 

J04088 

AA316181 

H15471 

AK002011 

BE621719 

H59799 

AF151076 



100254 
133688 
107129 
102696 
101753 
101597 
133512 



101332 
132967 
129 2i 



AF053306 
AW361638 
U22961 



100372 
100387 
131511 



Hs.61635 
Hs.132898
Hs.37558
Hs.173802
Hs.42644 
Hs.25199 
Hs.98658 
Hs.36708 
Hs.278338 



HSPC150 protein similar tc ubiquitin cor 
ESTs, Weakly similar to unknown [S.cer 
eukaryotic translation elongation factor 
extra spindle poles, S. o( 1 i mo 
farnesyl-diphosphatefan^ 1 ansfe 
F-box protein Fbx20 
fibronectin 1

FK506-binding protein 1B (12.6 kD) 

FK506-binding protein 5

flap structure-specific endonuclease 1 



glutamic-oxaloacetic transaminase 1, sol 
gb:Human Golli-mbp gene, exon 1 . 
H2A histone family, member X
H2A histone family, member Z
topoisomerase (DNA) II alpha (170kD)
six transmembrane epithelial antigen of
fatty acid desaturase 1
hypothetical protein FLJ11149
KIAA0603genepr< 



L119964 Hs.75616 



130350 
101045 
101544 



130553 
101626 
101992 



BE267931 
AA631143 
M236291 



Hs.75514 

Hs.75212 

Hs.179665

"Hs.99910 

Hs.46039 

"Hs.252587 

Hs.44 

Hs.77597 

Hs.41270 

"Hs.78996 

Hs.179809 

Hs.183583 

Hs.5101 



101118 
109166 
100830 



103131 
102212 
104254 



AC004770 
BE614410 
AA227069 



Hs.23044 
Hs.173737 
.002923 Hs.78944 
83092 Hs.1608 
001034 Hs.75319 
36069 Hs.2962 
111491 Hs.75069 
111425 Hs.180655
i18138 Hs.24447
159035 Hs.118400




NMJ14214 Hs.5753 

AL039104 Hs.159557 

H60720 Hs.81892 

BE562298 Hs.71827 

NMJ14791 Hs.1 84339 

D83777 "Hs.75137 

BE270734 "Hs.2795 

W27518 Hs.234489 

BE617695 Hs.286192 

BE300094 "Hs.227751 

BE300094 'Hs.227751 

AU076611 Hs.154672 

AW067805 Hs.172665 

AI859865 Hs.1 54443 

A1132988 Hs.109052 



KIAA0275 gene product 
nucleolar phosphoprotein p130 
pre-B-cell colony-enhancing factor 
gb:Human proliferating cell nuclear anti
gb:Human propionyl-CoA carboxylase beta-
inositol(myo)-1 (or 4)-monophosphatase 2
karyopherin alpha 2 (RAG cohort 1, impor
KIAA0101 gene product
KIAA0112 protein; homolog of yeast ribos
KIAA0175 gene product
KIAA0193 gene product 
lactate dehydrogenase A 
lactate dehydrogenase B 
protein phosphatase 1, regulatory (inhib 
lectin, galactoside-binding, soluble, 1
lectin, galactoside-binding, soluble, 1 
methylene tetrahydrofolate dehydrogenase 
methylenetetrahydrofolate dehydrogenase 
e deficient (S. 



1 1 Jit ur r transforming 1 

pie p ihepsr n binding growth fac 

polo (Drosophila)-like kinase

procollagen-lysine, 2-oxoglutarate 5-dio

proliferating cell nuclear antigen 

ESTs 

serine (or cysteine) proteinase inhibito 
protein regulator of cytokinesis 1 
protein regulator of cytokinesis 1 
proteolipid protein 2 (colonic epitheliu
RAB6 interacting, kinesin-like (rabkines
flap structure-specific endonuclease 1
RAD51 (S. cerevisiae) homolog (E coli Re
ras-related C3 botulinum toxin substrate
regulator of G-protein signalling 2, 24k
replication protein A3 (14kD)
ribonucleotide reductase M2 polypeptide
S100 calcium-binding protein P
serine hydroxymethyltransferase 2 (mitoc
serine/threonine kinase 12
sigma receptor (SR31747 binding protein
singed (Drosophila)-like (sea urchin fas
solute carrier family 1 (neutral amino a 
clone HQ0310 PRO0310p1 
synovial sarcoma, X breakpoint 2 



hi-lo-hi
hi-lo-hi
hi-lo-hi



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 

hi-lo-hi 



hi-lo-hi
hi-lo-hi
hi-lo-hi
hi-lo-hi
hi-lo-hi
hi-lo-hi
hi-lo-hi 



hi-lo-hi 
hi-lo-hi
hi-lo-hi

hi-lo-hi
hi-lo-hi
hi-lo-hi 



126645 AA316181 Hs.61635 



AA622037 Hs.166468 



131877 
100866 
133893 
130135 
130287 
126180 
101536 

103556 
300022 



AA311426 
AA479005 
L32977 



"Hs.21635 
Hs.154036
Hs.3712 



SWI/SNF related, matrix associated, acti 
synovial sarcoma, X breakpoint 2 
programmed cell death 5 
thymidylate synthetase
thyroid hormone receptor interactor 13 
topoisomerase (DNA) II alpha (170kD) 
general transcription factor IIIA 
transferrin receptor (p90, CD71)



NMJJ06002 Hs.77917 

NM_007019 "Hs.93002 

Z19002 Hs.37096 

AJ002744 Hs.246315 

NM_001360 Hs.11806

AF207664 Hs.8230 



UDP-N-acetyl-alpha-D-gaIactosamine:polyp 
7-dehydrocholesterol ta ' 
a disintegrin-lik 



125183 AV660804 



102146 
318538 
103554 
329365 
334282 



"Hs.301417 
"Hs.217493 
AW162057 Hs.78629
AI750979 Hs.74034 
AI878826 Hs.323469 



:H.X_hs gil5868838 



CH22 



134421 
124153 



AW475081 Hs.172928 



BE387561 
AU077333
AU077333 
AL137517 
H84730 



AA576453 
AI138628 
AW368576 



114795 
104204 
105200 
105493 
107977 
108880 
111157 
116202 
120689 
121847 
124182 
128515 



134109 
300258 
302767 

312689 
315715 
315843 
322447 
322826 
324867 
331336 



AA328102 
AL047586 
AI188161 
AA766605 
AL109729 
BE159395 
AW134519 



Hs.22981 
"Hs.160483 
"Hs.160483 
"Hs.306201 
Hs.326391 
Hs.173484 



Hs.308058 
Hs.139851 
Hs.173484 
Hs.173484 
Hs.57655 
Hs.24641 
Hs.10283
Hs.144627 
"Hs.47099 
Hs.18948 
Hs.87089 
Hs.96125 



CH22_FGENES.452_5 

CH22_FGENES.499_5 

CH22_FGENES.595_2 

CH22_FGENES.6fl4_5 

collagen, type I, alpha 1 

collagen, type V, alpha 2 

DKFZP586M1523 protein 

erythrocyte membrane protein band 7.2 (s 

erythrocyte membrane protein band 7.2 (s 

hypothetical protein DKFZp56401 278 

ESTs, Highl imilal I 1 7 teii 



gb:yb98h03.s1 Stratagene lung (937210) H 
gb:nm75h11 I N I. r - h > Horn pi 
ESI ' ri, imila - ii hp crprol 



RNA binding motif protein 8B 



Hs.107801 ESTs 



BE395085 Hs 10 

W19744 Hs.180059 

AA749230 Hs.22666 

NMJ17413 Hs.303084 

AA348031 Hs.7913 



Homo sapiens cDNA FLJ 20653 fis, dona KA 
ESTs 

apelin; peptide ligand for APJ receptor 



Hs.133159 ESTs, Weakly similar to PIHUSD salivary 



AW450461 Hs.203965 ESTs 

AI284219 Hs.130749 

AA679430 Hs.191897 

AI735759 Hs.52620 

AI807883 Hs.201771 

AI624707 "Hs.5921 



311034 
108647 
124955 
113923 
310557 



AA953006 Hs.88143 
Hs.30 



AI431798 
AI581344 
X02761 
AA670052 



Hs.311389 
Hs.44276 
"Hi 52484-1 
Hs.3849 
Hs.1641 92 
Hs.127312 
"Hs.287820 
Hs.169476 



thyroid receptor interacting protein 1 5 
ESTs, Moderately similar to FT0375 nadir 
homeoboxCIO 
hypothetical protein FLJ22622
hypothetical protein FLJ22041 similar to 
ESTs, Weakly similar to Y1 6LHUMAN HYPOT 
ESTs, Weakly similar to T1 7330 hypotheti 
fibronectin 1 

glyceraldehyde-3-phosphate dehydrogenase 



hi-lo-hi
hi-lo-hi
hi-lo-hi
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-hi 
hi-lo-hi 
hi-lo-hi 



hi-lo-lo 
hi-lo-lo 
hi-lo-lo 



hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo
hi-lo-lo
hi-lo-lo
hi-lo-lo
hi-lo-lo
hi-lo-lo 



hi-lo-lo 
hi-lo-lo 
hi-lo-lo 



hi-lo-lo 
hi-lo-lo 
hi-lo-lo
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
133666
103262 
100793 
102289 



U55184 

NMJ05756 

U56725 



AF052107 
W05150 
AA173942 
AA179949 
R34657 



134454 
302067 
105500 
100732 
129265 
117789 



105731 
105772 

Kb:-M 



Hs.90797 
"Hs.37034 
Hs.326416 
Hs.175563 



hypothelical protein FL, 
G protein-coupled recei 
heat shock 70kD protein 2 

gb:HOX C6=olass I hi 

Homo sapiens clone 23620 mRNA sequence 



homeoboxA5 
Homo sapiens mRNA; cDNA DKFZp564H1 91 5 (f 
Homo sapiens mRNA; cDNA DKFZp564N0763 (f 
ipling protein 2 (mitochondrial, pro 




103240 
321412 



134351 
125924 
130982 
133473 



■Hs.82109 
"Hs.82109 
Hs.21858 



AL050025 
D17793 
AL038450 



AW602166 
AA557660 
AA530892 
N48294 



BE174240 

H22566 

AB040927 

L43821 

AA834664 



113803 AW880709 



•Hs.76152 
Hs.171695 
Hs.46850 
"Hs.49136 
Hs.17283 

'Hs.30098 
Hs.301804 
Hs.80261 
Hs.29131
Hs.221132
Hs.273294



secretory leu kocyte protease in I 
sodium channel, nonvoltage-gated 1 alpha
solute carrier family 16 (monocarboxylic 
solute carrier family 7 (cationic amino 
solute carrier family 7 (cationic amino 
solute carrier family 7 (cationic amino 
synaptogyrin 3
syndecan 1 
syndecan 1 

trinucleotide repeat containing 3 
troponin T1 , skeletal, slow 
UDP glycosyltransferase 2 family, polype 
vasoactive intestinal peptide receptor 1
villin 2 (ezrin) 

Homo sapiens mRNA; cDNA DKFZp5eH172 (fr 
hypothetical protein FLJ20151 
aldo-keto reductase family 1, member C3 
ATP2C1 calcium transport ATPase, same as 
CD24 antigen (small cell lung carcinoma 
CEGP1 protein 



dual specificity phosphatase 1 
EST 

ESTs, Moderately similar to ALU7_HUMAN A 
hypothetical protein FLJ10890 
b:QV1 H I u5 7 3 290200 092 f06 HT0573 Homo 



hypothetical protein FLJ20069 
Apobec-1 complementation fac . 
chromosome 8 open reading frame 4 
ESTs 

Homo sapiens cDNA; FLJ 23241 lis, clone C 



hypothetical protein FU22174 



hi-lo-lo 
hi-lo-lo
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 



hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo
hi-lo-lo
hi-lo-lo
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 
hi-lo-lo 



lo-hi-hi 

lo-hi-hi

lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 



lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi
lo-hi-hi 
lo-hi-hi 



314219 
315052 
331919 
133240 



AA876910 
AA446869 
AK001489



Hs.48376 
Hs.134427 
Hs.119316 
Hs.242894 



Homo sapiens clone HB-2 mRNA sequence 

ESTs 

ESTs 

ADP-ribosylation factor-like 1 



lo-hi-hi 
lo-hi-hi 
lo-hi-hi 
lo-hi-hi
131762
129000 
105713 



"Hs.107767
"Hs.107767 
•Hs.184319 



Hs.27769 
Hs.106534 
"Hs.167017 



Hs.183475 
H 107fc 7 
Hs.104696 
Hs.7780 
"Hs.42484 
Hs.5897 
Hs.295901 
"Hs.162 
"Hs.181350 
H- 1719 r 



134921 
302335 
117921 
101701 



135192 
133886 
134142 

133534 
133011 
132150 
103110 
130173 
127435 
110520 
114660 
330541 
101486 



AL137491 Hs.125511 

AJ224172 Hs.204096 

M021459 Hs.306480 

NM_002436 Hs.1861 

AF127577 Hs.155017 

AB001914 Hs.170414 

U81802 Hs.154846 

AW379130 Hs.18953 

N98569 Hs.76422 

NU 005025 Hs.78589 

AF034799 Hs.30881 

AA021469 Hs.306480 



U97276 
BE244053 
X80821 
AU077115 



Hs.321709 
Hs.77266 
Hs.79362 
Hs.302177 



G-protein-coupled receptor induced prole 
Homo sapiens clone FLB8503 PR02286 mRNA, 
Homo sapiens clone PP1057 unknown mRNA 
hypothetical protein PRQ1489 
hypothetical protein PR01489 
ESTs, Weakly similar to KIM1006 protein 
gb:za46c11.s1 Soares fetal liver spleen 
ESTs, Weakly similar to AF151800 1 CGI-4 
gb:zo20f10.s1 Stratagene colon (937204) 
ESTs, Weakly similar to MCAT.H'JMAN MITOC 
hypothetical protein FUI22625 
gamma-aminobutyric acid (GABA) B recepto
group-specific component (vitamin D bind 
Homo sapiens clone 23664 and 23995 mRNA 
Homo sapiens clone 23860 mRNA sequence 
Homo sapiens clone 25061 mRNA sequence 
hyp ithcti - protein FU12806 
KIM1324 protein 

Homo sapiens mRNA; cDNA DKFZp564A072 (fr 
hypothetical protein FLJ10618 

• smRNA;cDNADKFZp586P1622(f 



insulin-like growth factor binding prote

kallikrein 2, prostatic 

kallikrein 3, (prostate specific antigen 

kallikrein 3, (prostate specific antigen 

Homo sapiens mRNA; cDNA DKFZp434P1 530 (f 

lipop h'lir B (uteroglobin far ember) 

Homo sapiens mRNA; cDNA DKFZp761 E21 12 (f 
membrane protein, palmitoylated 1 (55kD) 



phospholipase A2, group IIA (platelets,

ine (or cysteine) prot i i i 
protein tyrosine phosphatase, receptor t 
Homo sapiens mRNA; cDNA DKFZp761E2112 (f 
purinergic receptor P2X, ligand-gated io
quiescin Q6

retinoblastoma-like 2 (p130)



H.sapiens mRNA for rib 



=in L18 



X69086 "Hs.286161 

N54069 Hs.4082 
AA071383 

NU_002038 Hs.265827 

AA506324 Hs.1852 

NMJJ00481 Hs.102 

AA535210 "Hs.171995 

AU076801 Hs.89436 cadherin 1 7,'Ll cadherin - 



RNA binding motif protein 5 
sema domain, immunoglobulin domain (ig), 
seven In absentia (Drosophila) homolog 1 
sialyltransferase 1 (beta-galactaside al 
TAR (HIV) RNA-binding protein 1 
Homo sapiens cDNA FU1361 3 fs, clone PL 
lectin, galactoside-binding, soluble, 8 
gb:zm61d05.r1 Stratagene fibroblast (937 
interferon, alpha-inducible protein (do 



105402 
102976 
101793 
129890 
328164 



326816 
337603 
338561 



334221 
334222 
75 334578 



calcineurin-binding protein calsarcin-1
carbohydrate (chondroitin 6/keratan) sul 
cathepsin H 

CD59 antigen p18-20 (antigen identified 
tomosap scDNA I227I Ifis lor II 
CH.C6JS gi!5868068 
CH.07JS gi|6004473 
CH.16.p2gi|6682596 
CH,16_p2 gi|6682596 
CH.20JS gi|6552458 
CH22_C20H12.GENSCAN.16-2 
CH22_EM:AC005500.GENSCAN.421-5 
CH22_EMAC005500.GENSCAN,421-6 
CH22 FGENES.264_1 
CH22_FGENES,290_3 
CH22_FGENES.290_8 
CH22_FGENES.360J 
CH22_FGENES.360_3 
CH22 FGENES.406J 
CH22_FGENES,41-1 
CH22_FGENES.46-1 
CH22_FGENES.527_2 
CH22_FGENES.527_3 
CH22_FGENES.527_6 
301712



304263 
304275 
304309 



310014 
321896



D60745 
AI264847 
N51075 
W52448 



Hs.143600 
Hs.159263 
Hs.63931 
Hs.87359 



M838114 
AI681545 
T30290 



114877 AW024162 



M280679 
AL121523 
AA532718 
NMJ15678 



AA873285 
AA808466 
AI557212 



"Hs.17132 
Hs.17962
Hs.31595
Hs.36563 



CH22 roENES.617_7 

CH22_FGENES.619 1 1 (same as BFH5) 

CH22_FGENES.683_3 

CH22_FGENES.81_8 

chromosome 21 open reading frame 5 

type II Golgi membrane protein 

collagen, type VI, alpha 2 

dachshund (Drosophila) homolog 

ESTs, Highly similar to RB18 MOUSE RAS-R 

gb:ye48b07.r1 Soares fetal liver spleen 

gb:tg97d04.x1 NCI_CGAP_CLL1 Homo sapiens 

Homo sapiens cDNA: FLJ22165 fis, clone H 



Hs.292523 ESTs 
Hs.221612 - 
He.152982 
H 107515 
Hs.300646 
Hs.76228 



AK002161 
AL042005 
AA354572 
AI433357 
AA884766 

AA612626 Hs.144871 

AA075481 Hs.111334 

BE083080 Hs.274323 

AA325633 Hs.136102 

W05608 Hs.312679

AA115962 Hs.323423
AA082000 

AA782347 Hs.272572 



Hs.25925 
Hs.22545 
Hs.47191 
Hs.56147 
Hs.112748
Hs.102754 
Hs.25925 
Hs.22545 
Hs.47191 
Hs.56147 
Hs.112748
Hs.102754 
Hs.36475 
Hs.25329 
Hs.293782 



lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 



KIAA protein (similar to mouse paladin) 



AI242754 Hs.137306 ESTs 



gb:yu66f10.r1 Weizmann Olfactory Epithe! 



gb:yi60c11,r1 Soares placenta Nb2HP Homo 
Homo sapiens cDNA FU 1 1 22S fis, clone PL 
Homo sapiens cDNA: FLJ21904 fis, clone H
gb:EST1 41 92 Testis tumor Homo sapiens cD 
yeast Sec31p homolog 
tripept'dyl peptidase II 

gb:EST62857 Jurkat T-cells V Homo saplen 



KIAA0853 protein 
EST 

ESTs, Moderately similar to B Chain B, 
gb:zn26f07.r1 Stratagene neuroepithelium 
hemoglobin, alpha 2 

gb:zm05b1 1 .s 1 Stratageno corneal stroma 
gb:zm53h09.s1 Stratagene fibroblast (937 
gb:zm64c06.s1 Stratagene fibroblast (937 
ESTs, Weakly similar to KIAA0565 protci 
hypothetical protein FLJ23045

3f09 1NCI_CGAP_Co18Hoi n r 

ESTs 



cDNA: FLJ21 543 fis, clone C 



Homo sapiens cDNA: FLJ21543 fis, do 



Homo sapiens mRNA; cDNA DKFZp434C201 5 (f 

hypothetical protein FLJ13181

ESTs, Weakly similar to ALU1_HUMAN ALU S

Homo sapiens cDNA FLJ13136 fis, clone NT

hypothetical protein FLJ22635

Human DNA sequence from clone RP5-1046G1 

hypothetical protein 

ESTs 

ESTs, Weakly similar to ALU1_HUMAN ALU 



gb:/g04f09.s1 Soares infant brain 1MB H 
gt , h( INCLCi P_l 15 Hoi c jpien 
hypothetical protein FLJ14146
"Hs.8861 ESTs
Hs.269439 ESTs 
Hs.192531 
Hs.193247 
Hs.4055 
Hs.105887 
Hs.288529 
Hs.100691 ESTs 
Hs.145383 ESTs 



nscDNA: FU22783 fis, ch 
lo-hi-lo



310455 AI277603 Hs.145990 ESTs 
310787 AW262580 Hs.147674 KIAA1621 pn 



Hs.101316 
Hs.206132 
"Hs.119237 
Hs.302251 ' 
"Hs.127453 
Hs.151124
Hs.122505
Hs.290853
Hs.126707 
Hs.269880 
Hs.204339 
Hs.151500 
Hs.131704 
Hs.222830 
Hs.193288 
Hs.282884 
Hs.125232 



312821 
313097 
313166 
313179 



314146 
314305 
314456 
314465 
314881 
314916 



315344 
315353 
315439 
315528 
315720 
315772 
315841 
316042 
316244 



317224 
317275 
317404 
317488 
317916 



AI827237 
AI280112 
AI867931 
AA602917 



Hs.245834 
Hs.279610 
Hs.293696
Hs.184780 



AA889055 

AI660898 

AW138241 

X73608 

AI809444 

AI806867 

AW071851 

AI565071 



323045 
323091 
323262 
323410 
323645 



ESTs, Weakly similar to I38022 hypclrteli 
ESTs 

hypothetical protein FLJ11457



s cDNA FLJ13266 fis, clone OV 



ESTs, Moderately simiar to ALU5 JUMAN A 



hypothetical protein FLJ10493



Homo sapiens cDNA FLJ13580 fis, clone PL 



Hs.224988 ESTs 

Hs.152940 ESTs 

Hs.122156 ESTs 

Hs.123468 ESTs 

Hs.195602 ESTs

Hs.210846 ESTs 

"Hs.93029 sparc/osteonectin, cwcv and kazal-like d

Hs.202108 ESTs 

Hs.126594 ESTs

Hs.130628 ESTs 



Hs.244760 ESTs 



Hs.118112
Hs.269109 
Hs.246240 
Hs.125608 
Hs.29468 



29 1-NIB Homo sapiens cDNA clone 



AA101697 Hs.211270 

AA148950 Hs.188836

AI902456 Hs.210761 

AL133990 Hs.190642 

AW118683 Hs.154150 

AW445014 Hs.197746 
AW972227 
T78413 
AA541323 



Homo sapiens cDNA: FLJ22735 fis, clone K 
332314
131517 
315352 
315498 
321489 
106099 
105726 



312226 
102034 
134671 

309575 
134332 
1329D4 



105162 
331406 
131913 



Hs.38485
Hs.180877
AW474960 Hs.182258
AI579909 Hs.105104
M371307 Hs.125056
AW770320 Hs.222413
R41396 Hs.101774



AI459177 

NMJ12068 

NM_012068 



AI741506 
AW972771 
BE541042 



AI955040 
M315703 
AI903474 



Hs.17283
Hs.105484
Hs.53913 



He. 1 9 



Hs.230 
Hs.302749 
Y09763 Hs.22785 
AW168096 Hs.169476
D86962 Hs.81875 
NM_005518 Hs.59889 
N77976 Hs.251577 



N71725 



108732 
108731 
302123 
131614 



N94126 
AL049987 
AL049443 



332430 H25350 



AL036058 'Hs.76807 
M11321 

AA355986 Hs.232068 

M31669 Hs.1735

NM_014785 Hs.47313 

D87742 Hs.241552 

NMJ14867 Hs.5333 
Hs.198232 



iL1 33033 



'Hs.4084 
Hs.137476
Hs.23440 



hypothetical protein FLJ23045 



ain(TM), 

ESTs.Mc ysimi to ALU1 b < 
ESTs, Moderately similar to ALU1.HUMAN A 
ESTs, Moderately similar to ALU7_HUMAN A 
activating transcription factor 5 
i tivating li nscn|tioP factor 5 
DnaJ (Hsp40) homolog, subfamily A, membe 
ESTs, Weakly similar to ALULHUMAN ALU S 
" ly similai (o AL I1_H i AN ALUS 
ESTs, Weakly similar to ALU1_HUMAN ALU S 
Homo sapiens cDNA FLJ13496 fis, clone PL
hypothetical protein DKFZp762B226 
hypothetical protein FLJ 10890 
Homo sapiens regenerating gene type IV m 
hypothetical protein FLJ 10252 
HT01 8 protein 
ESTs, Weakly sin 
ESTs 



FK506-binding protein 9 (63 kD) 
gamma-aminobutyric acid (GABA) A recepto 
glyceraldehyde-3-phosphate dehydrogenase 
growth factor receptor-bound protein 10 
3-hydroxy-3-methylglutaryl-Coenzyme A sy



Hs.272572
Hs.13423
Hs.107476
Hs.107476



Hs.12969 
Hs.166361
Hs.161283
Hs.77868 




hemoglobin, alpha 2 
Homo sapiens clone 24... 
ATP synthase, H+ transporting, mitochond
ATP synthase, H+ transporting, mitochond
ATPase, aminophospholipid transporter (A
Homo sapiens mRNA from chromosome 5q21-2
hypothetical protein 

Homo sapiens mRNA; cDNA DKFZp564F112 (fr
Homo sapiens mRNA; cDNA DKFZp586N2020 (f 
ORF 

HT018 protein

hypothetical protein FLJ22489
Homo sapiens cDNA: FLJ21930 fis, clone H
hematopoietic PBX-interacting protein
major histocompatibility complex, class 



KIAA0258 gene product
KIAA0268 protein 
KIAA0711 gene product 
KIAA0884 protein 
KIAA1025 protein 
KIAA1051 protein 
KIAA1105 protein 

Homo sapiens LUCA-15 protein mRNA, splic 
degenerative spermatocyte (homolog Droso 



Hs.76272 
Hs.163593 
Hs.98710 



retinoblastoma-binding protein
ribosomal protein L18a 
hypothetical protein 



lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-'o 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hl-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hl o 
lo-hi-lo 
lo-hi-lo 



lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo



lo-hi-lo 

lo-hi-lo

lo-hi-lo 
lo-hi-lo 
lo-hi-lo 
lo-hi-lo 



lo-hi-lo 

lo-hi-lo

lo-hi-lo



lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo



lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo
lo-hi-lo








AA178955


Hs.271439


















AI741617 


H 1 03447 








AL120259


Hs.76691


















AI750878 


Hs.87409






130117 


U06641 


Hs.150207


UDP glycosyltransferase 2 family, polype 


lo-hi-lo 


124357 






gb:yw37g07.s1 Morton Fetal Cochlea Homo 


lo-hi-lo 




M069155 










BE567753 




BCL2/adenovirus E1B 19kD-interacting pro










gb:nr62h10.s1 NCI CGAP Lym3 Homo sapiens 










ib i 1 111 f 1 






M069820 


Hs.180909










2342 


ESTs, Moderately similar to B Chain B,




108406 


M075424 


Hs.325505 


ESTs, Moderately similar to HBA_HUMAN HE






M075601 










M079347 
gb:zn07h10.r1 Stratagene hNT neuron (937
















M083376 




j n09[]08 1 Strataqene h 1" n 1 1 1-1 ' J 






















lo-hi-lo-hi








CH.16 hs gi|5867087 












lo-hi-lo-hi 








22~E T I m i j 1 










diaphorase (NADH/NADPH) (cytochrome b-5
ESTs, Weakly similar to protease [H.sapi
NM_012445


"Hs 2S8126 






























Hs.87539


















AU076820 










AU076743 
AA262294


Hs.180383 








NM_001975


'i 1 1 1 1 r 0 






















KIAA1389 protein 






AK000742 


Hs.126774


L2DTL protein 






AA896986 




gb:al06a08.s1 Barstead spleen HPLRB2 Hom










q n45q10.s1 Gt i- Viln li mi Hon o 






AV655272 




novel Ras family protein 
hypothetical protein FLJ22316
ESTs














128959 


AI580127 


Hs.107381


hypothetical protein FLJ11200
T53925


Hs.107 


fibrinogen-like 1 


lo-lo-hi


133592 


AV652066 


Hs.75113 


general transcription factor IIIA 


lo-lo-hi 


103245 


BE566343


Hs.28988


glutaredoxin(thioltransferase) 


lo-lo-hi 


314785 


AI538226 


Hs.32976 


guanine nucleotide binding protein 4 




103677 


Z83806 




gb;H i mRN loi nemal ,ir|rn 




131170 


NM_014253


Hs.23796


odz (odd Oz/ten-m, Drosophila) homolog 1


lo-lo-hi 




AW013807 


Hs.182265 


keratin 19 


lo-lo-hi 


100409 


D86957 


Hs.80712 


KIAA0202 protein 


lo-lo-hi 


133167 


AW162840


Hs.6641 


kinesin family member 5C 


lo-lo-hi 


319080 


AW967646 


Hs.23023 


ESTs 


lo-lo-hi 


330706 


AF097994 


Hs.301528 


L-kynurenine/alpha-aminoadipate aminotra
mammaglobin 2 


lo-lo-hi 


104052 


NM_002407


Hs.97644 


lo-lo-hi 


100547 


M57417 




gb:Homo sapiens mucin (mucin) mRNA, part
Hs.38022
Hs.20985

Hs.199263

Hs.155223 

Hs.61635

Hs.177582

Hs.274509 



Hs.73793
H 25 17
Hs.25617
Hs.7331 
myosin-binding protein C, sir
novel Ras family protein 
ribosomal protein L7 
ESTs 



protease inhibitor, Kazal type 1
sin3-associated polypeptide, 30kD
Ste-20 related kinase
stanniocalcin 2
transmembrane epithelial antigen of



v-fos FBJ murine osl 
v-fos FBJ murine osteosarcoma viral onco
hypothetical protein FLJ22316
Homo sapiens cDNA: FLJ21409 fis, clone C
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AA61 3257 AA622228 AA522221 AA640290 AA541668 AA61 3555 AA570587 AA420606 AA594947 AA631696 AA579361 AA541677 AA2441 58 
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332002 1881 58_1 AI579909 AW967587 M766314 AW1 73502 M831027 AA991 192 AW407403 AW299333 AI356937 AA761270 AI979179 AW207045 M4B2009 

AW070877 AA280294 AW469632 AA250879 AA251 072 
332043 262097_1 AA371307 AW968802 AA988769 AA642428 AA490831 W96347 M649036
332120 374423_1 AA609684 M758732

332256 349165_1 AW975028 AA551969 AA644028 AA689303 AI220334 AI220090 AI9254B0 N66393
332265 1028361.1 AW770320 AW119114 D61961 N74375 N74427

332314 4771 92_1 R41396T25915 I I23454 AA846250 1142125 D62549 1118434 C00804 AI445466 AW206738 AI445643 T25862 AI339005 

332340 22951.1 AP000692 NM 005128 AJ237839 R82151 AB023150 AW410488 AJ003273 AW379450 M322182 R82140 AJ001857 BE004377 AW083855 

45 AI096738 AW070676 M678924 AW085802 W15495 BE551040 AI699147 AI510780 R82143 R82122 

332386 399.1 NM.000481 D14686 D13811 AU076448 BE293629 BE305090 BE250035 AA341258 F11299 F07978 AI129601 NB5119 W79101 C18828 

R09579 AW8501 08 N78273 AW95061 8 H791 61 BE008484 AI081517 AA534975 A1290815 AA741 1 22 M921395 AW591508 AI8B5347 
AI348172 N59532 W7981 5 AA9471 50 AW074248 AW751 377 AI057053 Al 34571 2 AI1 93862 H85887 M983969 AA804465 AABB6299 AI21 7497 
AI351816 F06959 AA503087 AA993107 F04230 AA550742 AA962122 AA969722 AI625176 T28574 AI767567 

5 0 332397 35955.1 AB027249 AF237709 BE245643 AW403476 AB027250 AF1 89722 AA353579 BE537775 W25389 AW962279 AW8181 97 AA449542 AA448898 

AW974523 AW19561 1 AI39331 5 AI738792 AW665895 AW574679 AA913471 AA651780 AA737663 AI015407 AI366737 AI285359 BE245537 
AA740847AW513628AI276471 AA405512AI^ 022AI I 3 "326 AA971676 AI002631 AW193686 AW118095 AI969593 BE466775 
AI375878 AI911763 AW269502 AA476576 AA768652 AI300231 AI216689 AI287319 
332430 53900.1 H25350 H28544 AI955873 N29952 N29938 R1 2730 AA2; I H 32 r 4?39 AW051288 AA598738 H62306 AI337901 AI056386 T1 8606 

5 5 H82372 AI761 586 AI88901 0 AW043582 AI765252 M620587 AM 9051 0 AI4941 28 Al 1 61 1 1 9 AI457908 AI420691 AW2361 32 AI91 71 95 AI949791 

AI433283AI1 46385 AI074325 H6221 0 AA8461 54 AI3447 1 5 Al 982957 AA524256 R39782 AI360821 A1124983 AA723581 AI289068 AW1 37304 
AW0731 1 6 W37495 AI335838 AL1 21 074 AW264699 AA865259 AI0B9458 AA782578 AA78861 8 AA595002 AH 67549 W071 31 AL120665 
332530 2356.1 M31 669 M31 682 M1 3437 AW370612 F00759 AI659282 W44452 AA600841 W44338 AA608807 AW973553 AW973542 M505620 AI458719 

AA936480 AA973451 T29876 AA577032 AIB74161 M670038 AM6991 1 AW006085 AI693790 AA872040 BE467580 BE467714 BE467700 

60 AW971 179 AA431428 AA938692 AA41 6873 AA49361 9 AI671 593 AW590794 AW01 6444 AI971 1 08 BE077433 C02533 AA593753 

332567 8509.1 AW939251 NM.005252 AU076596 V01 51 2 V01 51 2 AW579056 AA249247 AI590359 AW51 0478 AW51 82B2 BE046054 AW874080 AI268596 

M996237 AI695592 AI2441 1 7 AA290764 AA401 957 AA505878 AA42B304 W7401 8 W7401 6 AA040944 AI272071 AA745909 AA620979 
AA01 981 6 AI245094 AW009706 AA662536 AW024264 Ai268601 AA932024 AW51 3222 AW0241 69 AI659705 AA932526 AA975329 AI567603 
AI889320 AA514238 AA020837 AI623966 AA843677 AA477453 AA496353 AW372625 AV656426 K00650 W96348 N62388 R95977 AA434270 

65 AI093633 T27639 AW960245 AW881 177 R1o2t,J 1365 6 701 19315 AA337290 AA284642 AA344052 F05184 AA351062 AA378451 

AW794233 AW884380 N36951 R49879 AB022276 AA300350 AW839435 AW19170B BE220350 AA280404 AA485546 AW794235 AV654223 
AW838891 AA295986 N72823 AA335648 AA371089 AW845414 H63166 R12840AA379680AA477579 R13148 H71003 H71015 AA362156 
AW750674 AW84541 5 M366924 AW608044 AI570388 R31 5 1 1 R33906 R33921 AW663022 AW360985 AI207838 AW607239 AI672451 
AI573282 AW794752 AA370328 AW998896 AW797239 AW998912 AW794742 AI954543 AI810067 AW073373 AA370325 AW195330 C18106 

70 AW998736 R79476 AA429721 AI891081 41331^ \) 1 000 ^1630329 N99428 AI870222 AI971257 AI922196 AIB57753 

AW579397 D56749 AI925005 A1685727 AW805573 A1982678 AI784604 AI005625 AW877772 A1634947 AI950829 AA493243 BE166086 
A1801820 AI925643 AI627992AW31B704AI261318 D57757 AA887173 AW770406 AI972075 AI222254 A1675794 D58060 AI701954 D58166 
AI799500 AW805669 AW276098 AW874253 AI962991 AI2481 84 AW9S6924 AI017462 AW022260 AI885957 BE176841 AA878863 AI697419 
AW662094 AI479529 BE177025 D57403 AA50; 0998 AI985773 AA566089 AA442759 AI624670 AI460284 A1800205 

75 AI537788 AI537593 AI244382 AA583463 AA922678 AA864382 AI610837 D58070 AA844283 AA947992 N73801 AI453821 D581 84 AI678887 

AW243755 AA746085 D57742 AA757380 R44148 AA496403 BE180303 AW363528 BE006616D57395AW805507AW805511 AA617991 
AI373585 H30122 D57744 AW805501 D57691 D5814B AW873164 AW768483 D57601 M777812 AA837997 BE180123 D57599 M485387 
AW022208 D58096 N67917 W95944 AW3055C6 D5751 8 D57990 AI074096 D56521 D58151 AA428720 D56648 D57778 AW805504 D57750 
D58108 AW021706D57449 D57041 D58277 D56935 AI356974 D57023 AA018712 H27631 D57851 D57514D57268D5745BAW805646 

80 AI278945 D57323 D56986 D57539 D57829 D58078 AW805515 AI348684 D57772 R74449 BE041558 D56746 AW798485 D56640 AA985597 

D56702D56849 D56874AW581419 AA470397 D57591 AW798984T27640 N56497 D56803AA618186AW805647D57945 N23726D56637 
N23730 D56992 BE176882 BE176E E176909D5£ [ 37 AI559806 AA631437 D57464 D56718 C17030 T29278 D57377 
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92 D56934 T97774 AI473546 R74350 R84834 AA579200 D56616C03207 D57391 N52416 
D56928 R79209 D56925 AA020879 D45546 AI85376S R2C750 T09381 F01435 AW627906 D58202 AI933993 F01912 H27552 AA174191 
T16515 AW023216 AA434146 H8338' AI346751 vOI 51 2 V01512 AA576407 AW365140 AA937471 BE174681 AI568829 A1274663 R85530 
AL048225H83388AW798734 

AI826268 AW248872 H69511 A1748806 AW779557 AI992254 AIB90377 AW151271 AI356374 AI634503 AA777065 AI590131 H37767 AI889058 
H69512AA046480 N27343AI57CC08 I309 Al AW594603AW000790AI208239 AI275835 AW090294AA021587 AW273456 
AA505726 AW469424 AI400222 A'025723 BE046148 AM 2B668 BE350462 AW302B01 AI299977 AA284809 AI640358 AW470364 AI241 794 
AA650048 AW090027 H15377 AW61531 8 D60021 AI934336 AW1 1 8535 A;041281 AA614238 R85918 AW571741 AW516692 AW572232 
AW515188 AI798585 AI392825 Z40518 A1869580 AA469975 AI53781 9 AI810684 AI701744 AI370410 BE383083 Z44676 BE002481 BE002532 
AA456765 N44196 D60022 C14604 AA021099 AA284872 BE266647 AW249292 

BE568452 BE297396 AA449593 AW73249C AW069736 EE548657 AA207229 AF044588 NMJM3981 BE25B994 AW444578 AA471 151 
BE250747 AW732555 AA074582 BE336856 AW408764 AA191 1 59 BE092129 AA310614 AW958677 AA312276 AW750027 AW750046 
AW750032 AW50024 AA1 88893 AW750C54 AW408409 AW750030 BE151875 AA478509 N58721 AA195614 H70079 H75580 BE250401 
AA45451 8 AA007263 AA626405 M41 71 52 AA004230 M557354 AW8631 51 AW8S31 81 AA7021 79 AI9241 43 AI671 1 85 BE0061 98 AA190630 
AI638795 AI6091 1 3 AI056239 BE537023 BE464668 AA63441 3 BE208066 EE208833 AW250603 AI337375 AA478510 BE501624 AI814763 
AW594726 AI091 408 AA827285 AA1 891 08 AW5S41 69 BE61 8589 BE61 8040 AL1 35398 AA632206 AI0801 26 AI638180 AA725439 AI3791 07 
AI288872 H14801 AI679151 AI263619 AI55921 3 AI679722 W93249 AA552345 AA417030 AI969543 AA534494 AI038181 AA766364 AA573241 
AI754325 AW043937 BE207865 AI291 838 N73585 N73539 AW8C5051 AA808510 A1699B13 AW166044 AW104716 H05808 AA248270 
BE538022 N56013 AA621586 AA149737 D19671 AW1 92890 N54283 H73339 AA910989 BE273424 BE560082AW959012AA313552 
AW750034 BE072537 BE297947 AW732361 AA449336 D29574 

AF19101 9 NM_01 551 6 BE546494 AL1 10276 R13844 BE31 3586 BE336912 R18704R18703 AA045868 T70952 BE336901 T60387 BE149749 
BE271848BE271902AA489929 Z45402 T84360AA305745AA009451 T95706 H14907 AA299901 C03221 T72431 AW471185 AA335297 
AI2691 00 AA345072 AW9651 60 H27581 R4891 0 H25380 AA335281 AW973283 T79590 AW1 83447 T641 72 AI744097 Al 342358 AA3361 02 
AA335299 BE208375 AI140834 AA088181 AI86031 4 AI73861 3 T70902 R42077 AI884558 AA489798 AI130828 AA009735 H25381 AW612425 
R48801 H27507 H30105 H44671 AI631362 AA558470 AW014412 AA552059 AA045801 AW589435 AI039657 H14614 AA974256 R42078 
AI245758 T61886 AI559202 AI0741 39 AI81 731 3 AI041484 AA4371 38 AI61 3032 AI147B91 AI457945 AW197727 AI074399 AI758636 AI598048 
M972077 M85390 R36989 R71 936 AI867492 T40081 Z41 1 15 AA772775 T41 01 3 AI695691 T40996 AI826822 N93464 AW955524 AA088651 
Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled
"The DNA sequence of human chromosome 22." Dunham I. et al. (1999) Nature 402:489-49"
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
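The position entries in this table are written as `start-end` pairs, and some rows list the larger coordinate first (for example `3318017-3317932`). A minimal sketch of how such an entry could be read is given below. It assumes the coordinates are inclusive, and it only flags a descending pair rather than asserting which strand it came from; the function name `parse_exon_range` is hypothetical and not part of the original disclosure.

```python
def parse_exon_range(text):
    """Parse a 'start-end' nucleotide range as printed in the position column.

    Returns (low, high, length, descending), where `descending` is True when
    the first coordinate is larger than the second. Treating the coordinates
    as inclusive is an assumption made for this illustration.
    """
    start_s, end_s = text.strip().split("-")
    start, end = int(start_s), int(end_s)
    length = abs(end - start) + 1  # inclusive length (assumed convention)
    descending = start > end
    return min(start, end), max(start, end), length, descending


if __name__ == "__main__":
    for entry in ("73381-73768", "3318017-3317932"):
        print(entry, parse_exon_range(entry))
```

A descending pair may correspond to an exon predicted on the opposite strand, but that reading should be checked against the Strand column rather than inferred from the coordinates alone.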






333138 
333139 
333516 



Dunham, I. et.al. 
Dunham, I. et.al. 
Dunham, I. et.al. 
Dunham, I. et.al. 
Dunham, I. et.al. 
Dunham, I. et.al. 
Dunham, I. et.al. 



336721 
337182 
337674 



333124 
333/43 
334221 



73381-73768 

3361208-3361369 

3367643-3367726 



19315168-19315311 
19317083-19317195 



19323493-19323590 

20842088-20842682 

21497441-21497587 

26310772-26310909 

26314767-26314849 

26376860-26376942 

29161685-29161937 

3371522-3371586 

23934889-23934962 

3332616-3332697 

3335368-3335505 

3971764-3971900 

8138219-8138392 

17089711-17089988 

3318017-3317932 

7573218-7573060 

12730944-12730387 

12732417-12732289 

13285293-13285178 

14488605-14488526 



20147708-20147502 



25764330-25764251 
2158060-2157993 
2158060-2157993 
1299296-1299194 



330033 
326213 
326816 



85177-85237 
86663-86723 
60751-60927 



94608-94785 

131060-131232 

27080-27226 
Table 2A lists about 1165 genes selected to have an interesting expression pattern during androgen withdrawal of prostate cancer tissue. These genes were selected by
analysis of variance, such that the P value is less than 0.01, the 90th percentile [illegible] average intensity across all samples, and a comparison of any group
means shows a minimum 3 fold change. The interesting expression patterns can be broadly divided into the following categories:

1. Genes that are expressed early in the time course of androgen withdrawal, then drop off in expression, and then express again with emergence of androgen-independence
(hi-lo-lo-hi pattern in table 2A).

2. Genes that are expressed early in the time course, then drop off in expression immediately after androgen-withdrawal, and do not express again with emergence of androgen- 
independence (hi-lo-lo-lo pattern in table 2A). 

3. Genes that are expressed early in the time course, then drop off in expression after several days of androgen withdrawal, and do not express again with emergence of
androgen-independence (hi-hi-lo-lo pattern in table 2A). 

4. Genes that are not expressed early in the time course, but express only with emergence of androgen-independence (lo-lo-lo-hi pattern in table 2A). 

5. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and continue to express with emergence of androgen-independence (lo-lo-
hi-hi pattern in table 2A).

6. Genes that are not expressed early in the time course, but then express as androgen is withdrawn and drop off again with emergence of androgen-independence (lo-lo-hi-lo 
pattern in table 2A). 
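The six categories above can be read as a labeling of four consecutive stages as "hi" or "lo". The sketch below illustrates one way such labels could be assigned from per-stage intensity values. It covers only the 3 fold change criterion on group means; the analysis of variance filter is not reproduced. The function name `classify_pattern` and the geometric-midpoint cut-off are assumptions made for illustration and are not taken from the original disclosure.

```python
from statistics import mean


def classify_pattern(groups, min_fold=3.0):
    """Label each stage 'hi' or 'lo' from replicate intensities.

    groups: list of lists, one inner list of intensities per stage, in time order.
    Returns a string such as 'hi-lo-lo-hi', or None when the highest and lowest
    group means differ by less than `min_fold`.
    """
    means = [mean(g) for g in groups]
    lo, hi = min(means), max(means)
    if lo <= 0 or hi / lo < min_fold:
        return None
    # Assumed cut-off: geometric midpoint between lowest and highest group mean.
    cutoff = (lo * hi) ** 0.5
    return "-".join("hi" if m >= cutoff else "lo" for m in means)


if __name__ == "__main__":
    stages = [[900, 1100], [100, 120], [90, 110], [950, 1050]]
    print(classify_pattern(stages))
```

With the illustrative values shown, the first and last stages sit well above the cut-off and the middle two below it, which corresponds to the first category in the list.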

Table 2B lists accession numbers for primekeys lacking a UnigeneID in table 2A. For each probeset is listed a gene cluster number from which oligonucleotides were designed.
Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and 
Alignment Tools (DoubleTwist, Oakland California). Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column. 
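Each Table 2B row, as described above, carries a primekey, a gene cluster number, and the Genbank accession numbers of the sequences in that cluster. A minimal sketch of reading one such row is shown below, assuming a cleanly transcribed, whitespace-delimited line; the function name `parse_cluster_row` is hypothetical, and rows that wrap across several printed lines would need to be joined first.

```python
def parse_cluster_row(line):
    """Split one Table 2B row into (pkey, cluster_id, accessions).

    Assumes the row is whitespace-delimited: primekey first, then the cluster
    number, then one or more Genbank accession numbers.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected a pkey, a cluster number and at least one accession")
    pkey, cluster_id, *accessions = fields
    return int(pkey), cluster_id, accessions


if __name__ == "__main__":
    row = "332120 374423_1 AA609684"
    print(parse_cluster_row(row))
```

The cluster number is kept as a string because it contains a suffix such as `_1` in the printed table.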

Table 2C lists genomic positioning for primekeys lacking unigene ID's and accession numbers in table 2A. For each predicted exon is listed genomic sequence source used for 
prediction. Nucleotide locations of each predicted exon are also listed.

TABLE 2A: ABOUT 1165 GENES SELECTED TO HAVE AN INTERESTING EXPRESSION PATTERN DURING ANDROGEN WITHDRAWAL OF PROSTATE CANCER 
TISSUE 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Pattern: Broadly defined expression patterns during androgen withdrawal 



Pkey ExAccn UnigeneID Unigene Title Pattern


433412 


AV653729 


Hs.8185 


CGI-44 protein; sulfide dehydrogenase li 


lo-lo-hi-lo 


429097 


AK001270 


Hs.196086 


hypothetical protein FLJ10408 


lo-lo-hi-lo 


442731 
420820 
422267 
416953 


AI868167 
W26096 
AB033044 
N31537 


Hs.131044
Hs.336635

Hs.269046


ESTs 

Homo sapiens, clone IMAGE:4179462, mRNA

KIAA1218 protein

ESTs 


lo-lo-hi-lo 
lo-lo-hi-lo 

lo-lo-hi-lo 


413277 
410209 


1124177 
AI583661 


liases 

Hs.60548 


cathepsin O 

hypothetical protein PRO1635


lo-lo-hi-lo
lo-lo-hi-lo 


428523 


AW974540 


Hs.98626 


ESTs 


lo-lo-hi-lo 


435847 


W93821 


Hs.39780 


CDA017 protein
ESTs 




443967 


AW294013 


Hs.200942 


lo-lo-hi-lo 


440838 


AA907075 


Hs.131307 


ESTs 


lo-lo-hi-lo


404054 






Target Exon 


lo-lo-hi-lo 


431697 


H66740 


Hs.38540 


ESTs, Weakly similar to ALU4_HUMAN ALU S


lo-lo-hi-lo 


432114 
446397 


AL036021 
AW275603 


Hs.8934 
Hs.200712 


ESTs 
ESTs 


lo-lo-hi-lo 
lo-lo-hi-lo 


414094 


H15088 


Hs.31433 


ESTs 


lo-lo-hi-lo 


424005 


AB033041 




vang (van gogh, Drosophila)-like 2 


lo-lo-hi-lo 


424401 


H67220 


Hs.169681


death effector domain-containing 


lo-lo-hi-lo 


449749 


AI668611 


Hs.49760 


ESTs 


lo-lo-hi-lo


458368 


BE504731 


Hs.138827 


ESTs 


lo-lo-hi-lo 


427221 


L15409 


Hs.174007 


von Hippel-Lindau syndrome 


lo-lo-hi-lo 


432715 
425980 


AA247152 
AA366951 


Hs.200483 


ESTs, Weakly similar to KIAA1074 protein
gb:EST77963 Pancreas tumor III Homo sapi 


lo-lo-hi-lo 


412492 


AW962604 




gb:EST374677 MAGE resequences, MAGG Homo 


lo-lo-hi-lo 


438882 


M827695 




gb:od56c02.s1 NCI_CGAP_GCB1 Homo sapiens 


lo-lo-hi-lo 


422473 


U94780 


Hs.117242


meningioma expressed antigen 6 (coiled-c
NM_005936:Homo sapiens myeloid/lymphoid


lo-lo-hi-lo 


404211 






lo-lo-hi-lo 


423019 


AI640185 


Hs.283626 


ESTs 


lo-lo-hi-lo


443559 


AI076765 




ESTs, Moderately similar to ALUS HUMAN A 


lo-lo-hi-lo 


444291 


AI598022 


Hs!l93S89 


TAR DNA binding protein 


lo-lo-hi-lo 


428065 


AI634046 


Hs.157313


ESTs


lo-lo-hi-lo


442566 




Hs.12111 


ESTs 


lo-lo-hi-lo 


442202 


BE272862 


Hs.106534 


hypothetical protein FLJ22625 


lo-lo-hi-lo 


439456 


AI752409 


Hs.109314


hypothetical protein FLJ20980


lo-lo-hi-lo 


423476 


AL035633 




Human DNA sequence from clone RP5-1046G1


lo-lo-hi-lo 


437952 


D63209 


Hs.5944 


solute carrier family 11 (proton-coupled


lo-lo-hi-lo 


451987 


AA815092 


Hs.77554 


Homo sapiens cDNA FLJ14967 fis, clone TH 


lo-lo-hi-lo 


453408 


AI804732 


Hs.295963 






444004 


N39842 


Hs.301444 


KIAA1673 




452691 


AA164842


Hs.192619 


KIAA1600 protein 


lo-lo-hi-lo 


434865 


AW050449 


Hs.116507


ESTs 


lo-lo-hi-lo 


440819 


AI809444 


Hs.202108 


ESTs 


lo-lo-hi-lo 


419526 


AI821895 


Hs.193481 


ESTs 


lo-lo-hi-lo


422072 


AB018255 


Hs.111138 


KIAA0712 gene product




453459 


BE047032 


Hs.257789 


ESTs 


lo-lo-hi-lo


419038 


AW134924


Hs.190325 


ESTs 


lo-lo-hi-lo


413243 


AA769266 


Hs.193657 


ESTs 


lo-lo-hi-lo 


432079 


AW972746 




gb:EST384840 MAGE resequences, MAGL Homo


lo-lo-hi-lo 
441328 AI982794 Hs.159473

416508 R39769 

451066 AI758660 Hs.206132 

446017 N98238 Hs.55185 ESTs 

447104 R19085 Hs.210706 

447211 AL161961 Hs.17767 

447765 AW014112 Hs.1S1390 

429540 M85776 

444314 AH 40497 

432677 NM_004482 Hs.278611

422091 AI906339 Hs.97927 

423028 H90946 

444040 AF204231 Hs.182982 

441111 AI806867 Hs.126594 ESTs 

418838 AW385224 Hs.35198 eclonuclt 

415999 AA172179 Hs.294029 ESTs 



ESTs, Moderately similar to ALUB_HUMAN A



gb:EST02297 Fetal brain, Stratagene (cat 
gb:ow76b09.s1 Soares_fetal_liver_spleen_
phospholipase A2, group IIA (platelets,
UDP-N-acetyl-alpha-D-galactosamine:polyp



lo-lo-hi-lo 
lo-lo-hi-lo 



432527 
412093 
457121 
417280 
452445 
438624 
442343 
401416 
437176 
451663 
449295 
426848 
445467 



435284 
424332 
442369 
420717 






440260 AI972867 
426269 
428398 
407276 
409339 
442150 
415787 
430685 
443794 
446215 



AA811371 Hs.123362

AI867931 Hs.164595 

AW613780 Hs.13500 

AK000061 Hs.101590 

AI261700 Hs.145544

AL048056 Hs.23437 

AW515373 Hs.271249 

AW975028 Hs.102754 

BE242691 Hs.14947

AI743770 Hs.180513 

AW173116 Hs.250103 

AB002438 Hs.29596 

AA889055 Hs.123468

AA992480 Hs.129874 

AW176909 Hs.42346 

AI872360 Hs.209293 

AW137268 Hs.270954 

H72531 Hs.36190 

AI239832 Hs.15617

AI801098 Hs.151500

AL038460 Hs.48948 

AI948688 Hs.266619 

AA879470 Hs.96849 

AA338919 Hs.101615 

AI565071 Hs.159983

AA284447 Hs.271887 

AA838114 Hs.221612 
Hs.7130 
Hs.168950 

AI249368 Hs.98558 

AI951118 Hs.326736 

AB020686 Hs.54037 

AI368158 Hs.70983 

H01463 Hs.93534 

AI690234 Hs.191666 

N94104 Hs.29280 

AW821329 Hs.14368 

NM_002374 Hs.167



lo-lo-hi-lo 
lo-lo-hi-lo
lo-lo-hi-lo 



ESTs, Weakly similar to KIAA0822 prate l 
ESTs 

Homo sapiens mRNA from chromosome 5q21-2 



ESTs, Weakly similar to ALU4_HUMAN ALU S

ESTs 

ESTs 

ESTs 

Homo sapiens cDNA FLJ11492 fis, clone HE

ESTs 

ESTs 

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp566A1046 (f
ESTs 

Homo sapiens breast cancer antigen NY-BR 
ectonucleotide pyrophosphatase/phosphodi
PTPL1-associated RhoGAP 1 
ESTs 

ESTs, Weakly similar to GNMSLL retroviru
ESTs 

SH3 domain binding glutamic acid-rich pr 
microtubule-associated protein 2 
gb:601503815F1 NIH_MGC_71 Homo sapiens c 
ENSP00000226812*:KIAA1494 protein (Fragm 



lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 



lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 



449919 
407192 
436169 
418624 
432432 



451720 
429163 
432435 

433530 

430068 



AA301270 

AI367347 Hs.44898 

AI674685 Hs.200141 
AA609200 

AA888311 Hs.17602 

AI734080 Hs.104211

AA541323 Hs.115831

AA371307 Hs.125056

NM_006379 Hs.171921

AW602165 Hs.222399 

AW970985 Hs.290853 
AA884766 

BE218886 Hs.282070 

AW204516 Hs.31835 

BE349534 Hs.281789 

U25128 Hs.159499 
AA464964 

AA315703 Hs.199993 



gb:EST391359 MAGE resequences, MAGP Homo
gb:EST14192 Testis tumor Homo sapiens cD
Homo sapiens clone TCCCTA00151 mRNA sequ



C12000586*:gi|6330167|dbj|BAA86477.1|(A
sema domain, immunoglobulin domain (Ig), 
CEGP1 protein 
ESTs 

gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s



parathyroid hormone receptor 2 
gb:zx80f10.s1 Soares ovary tumor NbHOT H
ESTs, Weakly similar to ALUB_HUMAN !!!!






4323M 
434609 


AA533447 
R76593 


Hs.312989 


ESTs 

gb:yi60d 1 ,r1 Soares placenta Nb2HP Homo 


lo-lo-hi-lo
lo-lo-hi-lo


448760 


AA313825 


Hs.21941 


A0036 protein 


lo-lo-hi-lo


417381 


AF164142 


Hs.82042 


solute carrier family 23 (nucleobase tra 


lo-lo-hi-lo


456334 


T50392 


Hs.271745 


ESTs 


lo-lo-hi-lo


435445 


AA737345 


Hs.294041 


ESTs 


lo-lo-hi-lo


411928 


AA888624 


Hs.197289 


rab3 GTPase-activating protein, non-cata 


lo-lo-hi-lo


438869 


AF07S009 




gb:Homo sapiens full length insert cDNA 


lo-lo-hi-lo


423932 


T95633 


Hs.189703 


ESTs 


lo-lo-hi-lo


422222 


AI699372 


Hs.193247


hypothetical protein DKFZp434A171 




434941 


AW073202 


Hs.334825 


Homo sapiens cDNA FLJ14752 fis, clone NT


lo-lo-hi-lo


415736 


AA827082 


Hs.291872 


ESTs 




432722 


AA830532 


Hs.326150 


ESTs 




435511 


AA683336 


Hs.189046 


ESTs 


lo-lo-hi-lo




AW022715 


Hs.162160 


ESTs, Weakly similar to ALU4_HUMAN ALU S


lo-lo-hi-lo


451141 


AW772713 


Hs.247186 


ESTs 




4505-16 


AA010200 


Hs.175551 


ESTs 


lo-lo-hi-lo


413351 


BE086815 




ESTs 




439324 


AF086134 


Hs.94309


ESTs


lo-lo-hi-lo


452688 


AA721140 


Hs.49930 


ESTs, Weakly similar to putative p150 [H 


lo-lo-hi-lo 


415669 


NM_005025




serine (or cysteine) proteinase inhibito


lo-lo-hi-lo


450164 


AI239923 


Hs.63931 


ESTs 


lo-lo-hi-lo


417169 


R13550 


Hs.246773 


ESTs 


lo-lo-hi-lo


443645 


R36475 


Hs.24321 


Homo sapiens cDNA FLJ12028 fis, clone HE 


lo-lo-hi-lo


424878 


H57111 


Hs.221132 


ESTs 


lo-lo-hi-lo


449618 
432572 


AI076459 
AI660840 


Hs.15978
Hs.191202 


KIAA1272 protein

ESTs, Weakly similar to ALUE_HUMAN llil 


lo-lo-hi-lo 
lo-lo-hi-lo 


400293 


N51002 


Hs.306480 


Homo sapiens mRNA; cDNA DKFZp761E2112 (f 


lo-lo-hi-lo 


431474 


AL133990


Hs.190642


CEGP1 protein 


lo-lo-hi-lo 




T10707 


Hs.296355 


hypothetical protein FLJ23133


lo-lo-hi-lo 


438494 


AA908678 


Hs.130183


ESTs 


lo-lo-hi-lo 


425332 


AA633306 


Hs.127279 


ESTs 


lo-lo-hi-lo 


451411 


AA017492 


Hs.135655


EST 


lo-lo-hi-lo 


419972 


AL041465 


Hs.182982


golgin-67 


lo-lo-hi-lo 


434804 


AA649530 


Hs.348148 


gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens


lo-lo-hi-lo


442832 


AW206560 


Hs.253569 


ESTs 


lo-lo-hi-lo


408660 


AA525775 




ESTs, Moderately similar to PC4259 ferri


lo-lo-hi-lo


432674 


AA641092 


Hs.257339 


ESTs, Weakly similar to I38022 hypotheti 


lo-lo-hi-lo 


448150 


AM72J67 




ESTs 


lo-lo-hi-lo 


450468 


AW379075 




Homo sapiens cDNA FLJ12211 fis, clone rVA




452874 


AK001061 


fe30925 2 


hypothetical protein FLJ10199 


lo-lo-hi-lo 


412088 


AI689496 


Hs.108932 


ESTs 




443451 


AI057404 


Hs.58698 


ESTs 


lo-lo-hi-lo 


453853 


AL040600 


Hs.188083 


ESTs 


lo-lo-hi-lo 


419863 


AW952691 


Hs.93485 


Homo sapiens mRNA; cDNA DKFZp761D191 (f


lo-lo-hi-lo


420729 
440801 


AW964897 
AA906366 


Hs.290825 
Hi 190535 


ESTs 
ESTs 


lo-lo-hi-lo 


407284 


AI539227 


Hs.214039 


hypothetical protein FLJ23556 




428279 


AA425310 


Hs.155766


ESTs, Weakly similar to A47582 B-cell gr


lo-lo-hi-lo


«H862 


AI821940 




ESTs, Moderately similar to ALUS HUMAN A 




432340 


AA534222 




gb:nj21c02.s1 NCI_CGAP_AA1 Homo sapiens




442048 


AA974603 




gb:op34f05.s1 Soares_NFL_T_GBC_S1 Homo s 


lo-lo-hi-lo 


418781 
450642 


T41160 


Hs.8404 
Hs.7130 


ESTs 




4516G1 
435812 


AB020650 
AA700439 


Hs.188490 


Homo sapiens, Similar to KIAA0843 protei 
ESTs 


lo-lo-hi-lo
lo-lo-hi-lo


448065 


AI459177 




ESTs, Moderately similar to ALU7_HUMAN A 


lo-lo-hi-lo 


453486 


AL039201 


Hs.173554 


ubiquinol-cytochrome c reductase core pr 


lo-lo-hi-lo


414312 


AA155694 






lo-lo-hi-lo 


438980 


AW502384 




gb:UI-HF-BR0p-aka-f-12-0-UI.r1 NIH_MGC_
ESTs 


lo-lo-hi-lo 


408001 


AA046458 


Hs.95296 


lo-lo-hi-lo 


421476 


AW953805 


Hs.21887 


ESTs 


lo-lo-hi-lo


414426 


D60745 


Hs.25925


Homo sapiens, clone MGC:15393, mRNA, com
ANKHZN protein


lo-lo-hi-lo


444563 


N57057 


Hs.2841fi3 


lo-lo-hi-lo


418771 


AA807881 


Hs.25329 




lo-lo-hi-lo


417843 


W07361 


Hs.22545 


Homo sapiens cDNA FLJ12935 fis, clone NT 


lo-lo-hi-lo 


415565 


AA642449 


Hs.48994 


ESTs, Weakly similar to AF151800_1 CGI-4


lo-lo-hi-lo 


419229 


AI827237 


Hs.282884 


ESTs 


lo-lo-hi-lo 


419905 


AW248229 


Hs.93659 


protein disulfide isomerase related prot 


lo-lo-hi-lo 


452870 


AW502761 


Hs.30909 


KIAA0430 gene product 


lo-lo-hi-lo 


449059 


AK000566 


Hs.98135 


hypothetical protein FLJ20559 


lo-lo-hi-lo 


416157 


NM_003243 


Hs.342874 


transforming growth factor, beta recepto


lo-lo-hi-lo 


439305 


AW393883 


Hs.98968 


hypothetical protein FLJ23058


lo-lo-hi-lo


419235 


AW470411 


Hs.288433 




lo-lo-hi-lo 


416640 




Hs.79404 


neuron-specific protein 


lo-lo-hi-lo 


434938 


AW500718 


Hs.8115 


Homo sapiens, clone MGC:16159, mRNA, com


lo-lo-hi-lo 




AI241733 


Hs.43871 
Hs.35304 


ESTs 

Homo sapiens cDNA FLJ13655 fis, clone PL


lo-lo-hi-lo
lo-lo-hi-lo


418381 




Hs.119237


ESTs 


lo-lo-hi-lo 


432161 


AK0CC400 


Hs.341181 


ESTs, Weakly similar to envelope [H.sapi 


lo-lo-hi-lo


418283 


S79895 


Hs.83942 


cathepsin K (pycnodysostosis)


lo-lo-hi-lo 


421443 


BE550141 


Hs.156148


hypothetical protein FLJ13231 


lo-lo-hi-lo 



416619
449802 
446714 
413195 
438233 
416051 



AF013168 Hs.79393 

AW901804 Hs.23984 

W73818 Hs.110028

AA127382 Hs.22404

W52448 Hs.56147 

AA835868 Hs.25253 

AW946276 Hs.6441 

M365752 Hs.155965 

AI557212 Hs.17132 



protease, serine, 12 (neurotrypsin, moto



414516 
420028 
430223 



AA668884 Hs.1 

AB001914 Hs.170414 

Y09763 Hs.22785 

AW960427 Hs.342874 

AB014680 Hs.8786 

NM_002514 Hs.235935 

AL049443 Hs.161283 



AA985308 Hs.283902 ESTs 



ESTs, Moderately similar to I54374 gene 

ESTs 

ESTs 

ESTs 

paired basic amino acid cleaving system 
gamma-aminobutyric acid (GABA) A recepto
transforming growth factor, beta recepto 
ESTs M I. sin i,lar to T4345S hypotheti 
carbohydrate (N-aoe-'-' ' 



Homo sapiens mRNA; cDNA DKFZp586N2020 (f



428839 AI767756 

430334 AI824719 Hs.328700 ESTs

439686 W40445 Hs.235857 ESTs, Weakly similar to I38022 hypotheti

423754 NM_016181 Hs.132526 melanoma antigen 

415205 H71616 Hs.135233 ESTs 

426413 AA377823 

407204 R41933 Hs.140237 

430234 N29317 Hs.236463

437143 AW204056 Hs.3917

445162 AB011131 Hs.12376 

415083 AI632683 Hs.27179 

442924 AA533513 Hs.93659 

429536 AA873016 Hs.206097

458584 AF217518 Hs.324136 

419647 AA348947 Hs.91816 

427201 AB037860 Hs.173933 

428030 AI915228 Hs.11493 

411779 AA292811 Hs.72050

442482 NM_014039 Hs.8360

417458 NM_005655 Hs.82173 

438021 AV653790 Hs.324275 

409799 D11928 Hs.76845 

440676 NM_004987 Hs.112378

421437 AW821252 Hs.104336 

456362 AW973003 Hs.179909

407686 AW901268 Hs.126043 

431129 AL137751 Hs.263671 

431874 AW610031 Hs.323914

448072 AI459306 Hs.24908 

436860 H12751 Hs.5327 

448770 AA326683 Hs.21992 

AA093322 Hs.301404 

AW503398 Hs.293663 

BE560870 Hs.9052 



gb:EST90a05 Synovial sarcoma Homo sapien 

ESI , r IUM LUS 

KIAA1238 protein
ESTs

piccolo (presynaptic cytomatrix protein)
H c en n N F 112 i 
protein disulfide isomerase related prot 
oncogene TC21 
PTD012 protein 
hypothetical protein 
nuclear factor I/A

Homo sapiens cDNA FLJ13536 fis, clone PL
non-metastatic cells 5, protein expresse
PTD012 protein



WW domain-containing protein 1 



hypothetical protein FLJ22995 

■>1 open reading frar . . 
mRNA'cDNADKFZp434!0812{f 




440278 
441102 AA973905 

423942 AF209704 Hs.135723

425254 U91985 Hs.105658 

409324 W76202 Hs.343812 

431707 R21326 Hs.267905 

423335 AB018337 Hs.127287 

429200 AA447871 Hs.194215 

AW117322 Hs.42366

AW444448 Hs.49124 
Hs.270134 



PRO1914 protein

likely ortholog of mouse variant polyade 
RNA binding motif protein 3 
ESTs, Moderately similar to I38022 hypot
ESTs, Weakly similar to 2004399A chromos 
intermediate filament protein syncoilin 



D, alpha p 



409604 . 
431797 
437576 I 



421501 
457952 
414630 



402106 
404384 
445123 
401757 



AI815733 
M29971 
U25750 
BE410857 



Hs.145807
Hs.54773
Hs.22862
Hs.114360



DNA fragmentation factor, 45 kD 
lipoic acid synthetase 
hypothetical protein FLJ10422
KIAA0794 protein 

ESTs, Weakly similar to I38022 hypotheti 

ESTs 

ESTs 

hypothetical protein FLJ20280
prothymosin, alpha (gene sequence 28)
hypothetical protein FLJ13593
ESTs 
ESTs 



growth factor beta-stimulat
O-6-methylguanine-DNA methyltransferase
Human chromosome 17q21 mRNA clone 1 046:1 
gb:601301177F1 NIH_MGC_21 Homo sapiens c 
DC12 protein 

( 1 C ! i|6 |NP.O 41| 

ESTs 



lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo



lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi

lo-lo-hi-hi

lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi

lo-lo-hi-hi



AI762911 Hs.145369
AA836672 Hs.130694



lo-lo-h ; -hi 
lo-lo-hi-ni 
lo-lo-hi-hi 
lo-lo-hi-n : 
lo lo-h : -ni 
lo-lo-h'-ni 
lo-lo-hi-ni 
lo-lo-hi-ni 
lo-lo-hi-ni 
lo-lo-hi-ni 
lo-lo-hi-ni 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-ni 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
b-!o-hi-n ; 
io-fo-hr-h- 
lo-lo-hi-hi 
lo-lo-hi-rr 
lo-lo-hi-iv 
lo-lo-hi-hi 
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401563 
402786 

426484 AA379658 Hs.272759 

414343 AL036166 Hs.323378 

421970 AF227156 Hs.110103

422692 BE081857 Hs.94211 

413431 AW246428 Hs.75355 

426746 J03626 Hs.2057 
400237 
402532 
402396 



448622 AL046508 Hs.270607 
400501 

452324 W81486 Hs.58648 

453146 AI338952 Hs.32194 

430445 AW892432 Hs.65307 
401750 

435236 T03890 Hs.157208 

400375 NM_014115

412151 AA100529 Hs.286232 

410498 AA355749 

405044 

413169 AW161061 Hs.62954 
402101 

455019 AW850818 

446826 AK000626 Hs.16230

412180 AW898791 Hs.118837

407273 AJ132560 

452895 BE389229 Hs.30954 

416117 H19480 Hs.268787 

430934 AI792302 Hs.248141 

416309 R84694 Hs.79194 

444578 T80795 Hs.193702



405435 

422694 C06003 Hs.23782 

422912 AW405973 Hs.11637 

412748 BE083158 Hs.10862 
403704 

440507 H06994 



454261 AF216077 Hs.48376 

458956 BE220675 

418367 AA326035 Hs.59236 

444553 AI167530 Hs.149380 



408122 AI432652 Hs.42824 

409958 NM_001523 Hs.57697

408214 AL120445 Hs.77823 

421911 AL041520 

407813 AL120247 Hs.40109 

425211 M18667 Hs.1867 

442772 AW503680 Hs.5957 

419733 AW362955 Hs.224961 

428260 AW290886 Hs.86999 

427083 NM_0063S3 Hs.173497 



Eos Control 

Homo sapiens mRNA; cDNA DKFZp586I2022 (f
C15 2 3 i|73049i I J r ' 528.1] 
C1 000887*:gi| 1 2732453|ref|XP_01 1 474.1 1 C 
KIAA1457 protein
coated vesicle membrane protein 



uridine monophosphate synthetase (orotat
NM_001087*:Homo sapiens angio-associated
Target Exon 
Target Exon 
ESTs 

NM_014080:Homo sapiens dual oxide r li - 
ESTs, Weakly similar to STK2_HUMAN SERIN
It. Pi 100 51! •:KIAA161 ro i (Fragm 
ESTs 
ESTs 



NM_012448':Homosapii 
ESTs, Highly similar to ARX_MOUSE HOMEOB
NM_014115*:Homo sapiens PRO0113 protein
Homo sapiens cDNA: FLJ23190 fis, clone L
gb:EST64459 Jurkat T-cells VI Homo sapie
NM_014630*:Homo sapiens KIAA0211 gene pr
ESTs, Weakly similar to zinc finger prot
EN 5P00O 12177 5 Laminin alph clu n | 
[ L I i 1199-326 no 
hypothetical protein FLJ20619
gb:CM0-NN0075-130400-332-f06 NN0075 Homo

gb:Homo sapiens mRNA for imi ' "- 

phosphomevalonate kinase 
ESTs 



444850 AW444882 Hs.148483 ESTs



C17000574:gi|8923190|ref|NP_060178.1|hy



Target Exon 
Target Exon 

hypothetical protein FLJ12847



AW732837 Hs.42390 
AA115575 Hs.114914
AI911333 Hs.171689



Homo sapiens cDNA: FLJ23313 fis, clone H
Target Exon 

gb:yl81b07.r1 Soares infant brain 1MB H 
C7000609 , :gi|628012|pir||A53933 myosin I 
gb:ye74c04.r1 Soares fetal liver spleen 
Homo sapiens clone HB-2 mRNA sequence 
gb:ht98f11.x1 NCI CGAP U24 Homo sapiens 
hypothetical protein DKFZp434L0718
ESTs 

NM_024810:Homo sapiens hypothetical prot
ESTs, Weakly similar to HSJ2_HUMAN DNAJ
hypothetical protein FLJ22558 
ESTs 

ENSP00000247650*:Hypotheticai 1 77.6 kDa 



BE154142 Hs.96833 ESTs 



AA310177 Hs.103931 

AA186733 Hs.292154 
BE145419 

AI088585 Hs.118904

AF285120 Hs.283734 



NM_002439*:Homo sapiens mutS (E. coli) h

I protein 1 
DKFZP434B0335 protein 
stromal cell protein 

gb:IL5-HT0198-291099-009-E01 HT0198 Homo 
ESTs 

CGI-204 protein 

C12000519:gi|7710046|ref|NP_057914.1| ki
glutamine-fructose-6-phosphate transamin
thyroid hormone receptor interactor 8 
gb:EST49730 Gall bladder I Homo sapiens 
hypothetical protein FLJ10718 
hyaluronan synthase 1 
hypothetical protein FLJ21343 
gb:DKFZp434G2317_s1 434 (synonym: htcs3) 
KIAA0872 protein 
progastricsin (pepsinogen C) 
Homo sapiens clone 24416 mRNA sequence 
Homo sapiens cDNA FLJ14415 fis, clone HE
ESTs, Weakly similar to S65657 alpha-1 C-
Sec23 (S. cerevisiae) homolog B



lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi 



lo-lo-hi-hi 
lo-lo-hi-hi
lo-lo-hi-hi 
lo-lo-hi-hi 



lo-lo-hi-hi 
lo-lo-hi-hi 



lo-lo-hi-hi 
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi



lo-lo-hi-hi
lo-l< I 



lo-lo-hi-hi 
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
418583 AA604379 Hs.86

407355 AA846203 Hs.19i 

454003 AA058944 Hs.11> 

425322 U63630 Hs.15 
402240 

421867 AA481078 Hs.10' 



400277 
400995 
400818 



405610 

414242 AA749230 

420757 X78592 

400965 

401192 

404407 

401405 



433627 AF078866 

410204 AJ243425 

432642 BE297635 
400769 

433980 AA137152 
403725 

413587 AA156164 

422614 AI908006 
400275 



Hs.286049 

Hs.286241 
Hs.295362 



hypothetical protein

ESTs, Weakly similar to ALU1_HUMAN ALU S
Homo sapiens, clone IMAGE:4154008, mRNA,
protein kinase, DNA-activated, catalytic
Target Exon

hypothetical protein FLJ10498

Homo sapiens mRNA; cDNA DKFZp564H1916 (f 

hypothetical protein DKFZp762M115 

KIAA0118 protein

Eos Control 

C11000295*:gi|12737279|ref|XP_012163.1|
Target Exon 

C1001899*:gi|12722636|ref|XP_010672.1| e
Target Exon 

ENSP00000241065':CDNA 
doflbhyl-phosphate {UDP-N-acet>h,n • vn 

D2190*:gi|1 79|ref| 1 21 63.1 1 
Target Exon 
Target Exon 
Target Exon 

3221 jH " ( 2|i 
C9000306*:gi|1273728C|ref|X 3 J!C6682.2|k 
Homo sapiens cDNA: FLJ22993 fis, clone K
early growth response 1
heat shock 70kD protein 9B (mortalin-2)
Target Exon 



Target Exon 



452049 BE268289 Hs.27693 peptidylprolyl isomerase (cyclophilli 



445677 H96577 

428770 AK001667 

428403 AI393048 

434647 W74158 
402807 

413992 W26276 Hs.136075

407191 AA608751 



hypothetical protein FU108C5 
leucine rich repeat (in FLII) interactin 
lipopolysaccharide specific response-68



411984 NM_005419 Hs.72988

451017 BE391847 Hs.181173 
404108 

407819 R42185 Hs.102720 ESTs 

435876 AW612586 Hs.160271

436716 AI433540 
401419 

424363 AW512144 Hs.346947 



RNA, U2 small nuclear
gb:ac56h07.s1 Stratagene lung carcinoma 
Target Exon 

signal transducer and activator of trans 
hypothetical protein MGC10771
1*:gl|4235142|gb|AAD14470.1| (ACO 



G protein-coupled receptor 48 
gb:ti69g05.x1 NCI_CGAP_Kid11 Homo sapie
Target Exon 

ESTs, Weakly similar to A48809 carboxyle 



415516 F11411 

423144 AW851527 

452560 BE077084 

439827 AA846538

419709 AA255592 

413672 BE156536 

425291 AA354572 

427403 AA402107 

430911 AW937461 

435293 AI040777 

448490 AI523897 



Hs.187389 ESTs



ESTs, Weakly similar to alti 
gb:QV0-HT0368-310100-091-h10 HT0368 Homo 
gb:EST62857 Jurkat T-cells V Homo sapien
ESTs, Moderately similar to I38022 hypot 



411690 AA669253 

414739 U838B7 

444169 AV648170 

420911 U77413 

422195 AB007903 

452704 AA027823 

425074 AA495930 

426376 N46752 

447754 AW073310 

413686 AI469213 



Hs.117170 
Hs.271692
Hs.58446 
Hs.314451 

Hs.271273 
Hs.204715 
Hs.136075 
Hs.77196 



ESTs, Weakly similar to I38022 hypotheti 
ESTs 

ESTs, Weakly similar to ALU1_HUMAN ALU S
gb:za22n11.r1 Soares fetal liver spleen
Homo sapiens cDNA FLJ12335 fis, clone MA



Hs.100293 O-linked N-acetylglucosamine (GlcNAc) tr 

Hs.113082 KIAA0443 gene product

Hs.149424 Homo sapiens PNAS-130 mRNA, complete cds 

Homo sapiens cDNA: FLJ22165 fis, clone H
Hs.302985 ESTs 

Hs.163533 Homo sapiens cDNA FLJ14142 fis, clone MA 



Hs.3826 
Hs.231436 
Hs.20274 
Hs.136164 



kelch-like protein C3IP1

hypothetical protein FLJ20084

ESTs, Weakly similar to unnamed protein

cutaneous T-cell lymphoma-associated tum

gb:QV3-BT0296-011299-022-g09 BT0296 Homo



lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-lo 

lo-lo-hi-lo 

lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-lo 



lo-lo-hi-hi
lo-lo-hi-hi



b-lo-hi- ) 
lo-lo-hi-hi

lo-lo-hi-hi



lo-lo-h'-lr 
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi
440422 AW452696
435819 AA731746 
413644 BE154910 



408722 AA487860 Hs.295102 ESTs

459710 AI701596 Hs.121592 ESTs

417918 AA209205 Hs.163754 hypothetical protein FLJ12606

f^lt ,™ n « NM_022095*:Homo sapiens hypothetical C2H 

424387 AI739312 Hs.284163 ANKHZN protein 

427220 AF069517 Hs.173993 RNA binding motif protein 6 

410451 BE065687 gb:RC3-BT0316-270400-016-f10 BT0316 Homo

f»7 3 NM_0C6165*:Homosapier,3 nuclear factor r 

407218 AA095473 Hs.28505 ubiquitin-conjugating enzyme E2H (homolo

449312 N71673 Hs.223666 ESTs 

419612 AI498267 Hs.110613 KIAA0421 protein

455272 BE148152 gb:RC4-HT0231-041199-012-b04 HT0231 Homo 

[nma NM_005177*:Homo sapiens ATPase, H+trans 

.1 30760 myosin phosphatase, target subunit 2 

.120232 ESTs 

.278793 ESTs, Weakly similar to 2195 HUMAN ZINC 

;:rr" "- 199961 ESTs, Weakly similar to ALU7~ HUMAN ALU S 

448198 BE622100 Hs.209406 ESTs, Weakly similar to I33500 zinc fing 

450488 AA009999 Hs.59159 ESTs, Moderately similar to HPV16 E1 pro

433507 AI817336 Hs.191791 ESTs 

438995 AW748336 Hs.110613 KIAA0421 protein 

442789 AW904361 Hs.13119l ESTs, Weakly similar to ALU7 HUMAN ALU 

407251 U67611 ' IJ -' — ' 

409051 AA0B0912 

409123 AA063403 _ „, ,„, d 

416225 AA577730 Hs.188684 ESTs, Wea 

433735 AA608955 Hs.109653 ESTs 

434404 AW445034 Hs.256578 ESTs 

446667 BE161878 Hs.224806 ESTs 

447982 H22953 Hs.137551 ESTs

438890 AA827756 Hs.135049 ESTs, Weakly similar to ALU7 HUMAN ALU S

427882 AA640987 Hs.193767 ESTs

459680 H96982 Hs.42321 ESTs

416632 H69480 Hs.141304 ESTs

453876 AW021748 Hs.110406 ESTs, Weakly similar to I38022 hypotheti

414528 AA148950 Hs.188836 ESTs

419902 AA804409 Hs.118920 ESTs

409542 AA503020 Hs.36563 hypothetical protein FLJ22418

433560 AI925195 Hs.130891 hypothetical protein MGC4400

447499 AW262580 Hs.147674 protocadherin beta 16



gb:wd73f12.x1 NCI_CGAP_Lu24 Homo sapiens
Homo sapiens mRNA; cDNA DKFZp434C2016 (f 
Hs.23558 ESTs, Weakly similar to A48042 lysosomal 
NM_019111*:Homosapiensmcjo n= j \ 
i hi !o i tililyci ipl 
Homo sapiens cDNA: FLJ22783 fis, clone K



435023 AI692552
412156 H29487 Hs.17110
414505 R45389 
404277 

414662 AL036058 Hs.76807 
444430 AI611153 Hs.6093 
445612 N94126 Hs.12969 
403739 
403740 

411C84 T18987 Hs.125472 ESTs, Moderately similar to KIAA0877 pre 
429143 AA333327 Hs.197335 plasma glutamate carboxypeptidase
443060 D78874 Hs.8944 procollagen C-endopeptidase enhancer 2 
422749 W01076 Hs.278573 CD59 antigen p18-20 (antigen identified 
429441 AJ224172 Hs.204096 lipophilin B (uteroglobin family member) 
4 4382 AW380339 Hs.8068 hematopoietic PBX-interacting protein 

3 Hs.7888 Homo sapiens clone 23736 mRNA sequenc

446874 AW968304 iis 56I55 

412795 BE241753 Hs.74592

430325 AF004562 Hs.239356

426392 AW968324 Hs.17384
447448 BE244285 

415743 AA167664 Hs,14333 L 

431607 AB033097 Hs.183669 KIAA1271 pn 

411979 X85134 Hs.72984 retinoblastoma-binding protein 5

453620 BE396163 Hs.25005 ESTs, Weakly similar to ALU5 HUMAN ALU S 

431099 Y13367 Hs.249235 phosphoinosito 2 I 

421687 AL035306 Hs.106823 hypothetical protein MGC1 4797 

439565 AF086386 Hs.145599 ESTs 

442349 W40516 Hs.132355 Homo sapiens cDNA: FLJ22119 fis, clone H

410096 AW245200 Hs.267400 hypothetical protein MGC5540 

^l 2452 ^ 83286 ESTs, Weakly similar to S14747 sphingomy 

431802 AL133570 Hs.270571 Homo sapiens mRNA; cDNA DKFZp434L2C1 (fr 

441715 AI929453 Hs.342655 Homo sapiens cDNA FLJ13289 fis, clone 07

458230 BE311851 Hs.6639 KIAA1624 protein 

428788 AF082283 Hs.193516 B-cell CLL/lymphoma 10

450818 AI740573 Hs.142827 P311 protein 

419576 AK002060 Hs.91251 hypothetical protein FLJ11198

400401 AF159093 Homo sapiens endogenous retrovirus RAN1 

427004 AI921573 Hs.213107 ESTs 

401178 AA046772 RNA binding motif protein, X chromosome 



lo-lo-hi-hi
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi 



lo-lo-hi-lo

lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo



lo-lo-hi-lo 
lo-lo-hi-lo
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo 



lo-lo-hi-lo 
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo 

I I hi 

lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo
lo-lo-hi-lo 
lo-lo-hi-lo 
lo-lo-hi-lo
423749


U09848 


Hs.132390


zinc finger protein 36 (KOX 18)


lo-lo-hi-lo 




AB033070 


Hs.194408 


KIAA1244 protein


lo-lo-hi-lo 


458258 


AW406546 


Hs.127971


ESTs 


lo-lo-hi-lo 


429521 


BE048708 


Hs.50949 


ESTs 


lo-lo-hi-lo


402185 






Target Exon 


lo-lo-hi-lo 


415961 


H10983 


Hs.155919 


ESTs 


lo-lo-hi-lo 


457265 


AB023212 


Hs.225967 


KIAA0995 protein


lo-lo-hi-lo 


412419 


AW948630 




gb:QV0-FT0001-050500-22&fl05 FT0001 Homo 


lo-lo-hi-lo


438397 


AA806478 


Hs.123206


ESTs 




440509 


BE410132 


Hs.134202 


ESTs, Weakly similar to T17279 hypotheti




423895 


AA332215




gb:EST36124 Embryo, 8 week 1 Homo sapien 


lo-lo-hi-lo


400251 






NM_004651*:Homo sapiens ubiquitin specif


lo-lo-hi-lo 


445094 


AW296163 


Hs.147296 


ESTs 


lo-lo-hi-lo 


432323 


AK001409 


Hs.274356 


hypothetical protein FLJ10547


lo-lo-hi-lo 


444290 


AA262496 




gb:zs20f11.r1 NCI_CGAP_GCB1 Homo sapiens 


lo-lo-hi-lo 


435803 


Z44194 


Hs.4994 


transducer of ERBB2, 2 


lo-lo-hi-lo 


436905 


N31273 


Hs.42380 


ESTs 


lo-lo-hi-lo 








Target Exon 


lo-lo-hi-lo 








C19000553*:gi|12741444|ref|XP_008888.2| 


lo-lo-hi-lo 




AB018249 




small inducible cytokine subfamily A (Cy 




448176 


AI672546 


Hs.170507 


ESTs 


lo-lo-hi-lo 


409259 


AW608930 


Hs.52184 


hypothetical protein FLJ20618 


lo-lo-hi-lo 


457335 


AW969834 


Hs.303303 


ESTs 


lo-lo-hi-lo 


452444 


BE144022




gb:MR0-HT0165-191199-004-105 HT0165 Homo


lo-lo-hi-lo


405429 






Target Exon 


lo-lo-hi-lo 


430103 


AA465259 




gb:aa33b03.r1 NCI CGAP GCB1 Homo sapiens 
ESTs 


lo-lo-hi-lo 


439944 


AA856767 


Hs.124623 


lo-lo-hi-lo


411283 


AW852754 




gb:PM1-CT0247-180100-009-c05 CT0247 Homo 


lo-lo-hi-lo 


458195 


R10085 


Hs.130370 


ESTs 


lo-lo-hi-lo 


452654 


BE004783 




gb:MR2-BN0114-270400-004-e11 BN0114 Homo


lo-lo-hi-lo 


425684 


AF000989 


Hs.159201


thymosin, beta 4, Y chromosome 


lo-lo-hi-lo 


429452 


AI949495 


Hs.133998


Homo sapiens cDNA FLJ13202 fis, clone NT 


lo-lo-hi-lo 


431709 


AF220185 


Hs.267923


Lnchar t ize i n/ r rial i protei r h 


lo-lo-hi-lo 


411701 


BE181659 




1 IT 1 51 i HI 1 


lo-lo-hi-lo


430729 


AI572560 


Hs.301283 


KIAA0793 gene product 


lo-lo-hi-lo 


447476 


BE293466 


Hs.20880 


ESTs, Weakly similar to I38022 hypotheti


lo-lo-hi-lo 




AW293661 


Hs.131887






405365 






CX001212':gi|7861932|gb|AAF70445.1|(AF2 


lo-lo-hi-lo 




AA244416 




gb:nc07d11.s1 NCI CGAP Pr1 Homo sapiens 


lo-lo-hi-lo 


446103 


U90918 


Hs.13604


hypothetical protein dJ462023.2 


lo-lo-hi-lo 


400986 






NM 024085' Homo sapiens hypcth t jro 


lo-lo-hi-lo 


424194 


BE245833 


Hs.169854


gb:TCBAP1E1908 Pediatric pre-B cell acut


lo-lo-hi-lo 


400210 






Eos Control 


lo-lo-hi-lo 


400234 






NM_005335:Homo sapiens high density lipo


lo-lo-hi-lo 


400235 






IM ( S5:H pi high density lip 


lo-lo-hi-lo








NM_022170*:Homo sapiens Williams-Beuren


lo-lo-hi-lo 


433075 


NM_002959






lo-lo-hi-lo 


406302 






C16000922:gi|7499103|pir||T20903 hypothe


lo-lo-hi-lo 


428181 
456629 


AA423976 
AW891965 




gb:zv62h06.s1 Soares_testis_NHT Homo sap




426940 


AA393537 


Hsi98347 


histone deacetylase 3 

ESTs, Weakly similar to JC5308 testis-sp 




.:335S5 


AA535902 




Homo sapiens HERC2P7 pseudogene, partial 


lo-lo-hi-lo 












448631 


AI554923 




gb:te53h12.x1 Soares_NFL_T_GBC_S1 Homo s
Homo sapiens unknown mRNA sequence 


lo-lo-hi-lo 


433521 


T66087 


Hs.112482 


lo-lo-hi-lo 




AA446971 




gb:zw85f11.s1 Soares_total_fetus_Nb2HF8




450739 


AI732707


Hs.116506


ESTs, Weakly similar to ALU7_HUMAN ALU S




440004 


BE397117 


Hs.120824


hypothetical protein FLJ21845


lo-lo-hi-lo 


403947 


NM_005032




plastin 3 (T isoform) 


lo-lo-hi-lo




AW410458 




chromosome 11 open reading frame 2


lo-lo-hi-lo 


402163 






1 10010 5*:gi]4537179]gb|AAD23607.1]AC00 


lo-lo-hi-lo








ENSP00000251884:KIAA1521 protein (Fragme




400220 






Eos Control 


lo-lo-hi-lo 


401444 






Target Exon 


lo-lo-hi-lo 




BE143703 




gb:MR0-HT0164-191199-004-f03 HT0164 Homo


lo-lo-hi-lo


400205 






Eos Control 


lo-lo-hi-lo


458659 


AW749895 


Hs.332520 


Homo sapiens mRNA; cDNA DKFZp434A1014 (f


lo-lo-hi-lo 


428666 


AL080190 


Hs.189242


Homo sapiens mRNA; cDNA DKFZp434A202 (fr 


lo-lo-hi-lo


428442 


AA428638 


Hs.98606 


ESTs 


lo-lo-hi-lo


440151 


AA868167 




gb:ak38e07.s1 Soares_testis_NHT Homo sap


lo-lo-hi-lo


431046 


AW854382 


Hs.249126 


Homo sapiens clone 24894 mRNA sequence 






AI091173 


Hs.222362 


ESTs, Weakly similar to p40 [H.sapiens]


lo-lo-hi-lo 


402469 






Target Exon 


lo-lo-hi-lo




R45481 


Hs.23719 


ESTs, Weakly similar to I38022 hypotheti 


lo-lo-hi-lo 


446893 




Hs.7110 


ESTs 


lo-lo-hi-lo 


442336 


AW340958 


Hs.7572 


ESTs 


lo-lo-hi-lo 


421290 


NM_014368 


Hs.103137 


LIM homeobox protein 6


lo-lo-hi-lo 


450374 


AA397540 


Hs.60293 


Homo sapiens clone 122482 unknown mRNA 


lo-lo-hi-lo 


402347 






Target Exon 


lo-lo-hi-lo 


415184 


AA380436 


Hs.211973 


homolog of Yeast RRP4 (ribosomal RNA pro 
TcD37 homolog


lo-lo-hi-lo 


415632


U67085 


Hs.78524 


lo-lo-hi-lo 


423718 


AL119520


Hs.180737 


Homo sapiens clone 23664 and 23905 mRNA 


lo-lo-hi-lo 






449140 


AW013840 


Hs.202092 




lo-lo-hi-lo 


431241 


AA496799 


Hs.36958


ESTs


lo-lo-hi-lo








gb:yr88f07.r1 Soares fetal liver spleen


lo-lo-hi-lo 


424168 




Hs.321677 


signal transducer and activator of trans 




401600 


BE247275 




U5 snRNP-specific protein, 116 kD 


lo-lo-hi-lo




AF000982 


Hs.147916 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 






BE047679 


Hs.152982 


^pothetic prot in Fi I 17 


lo-lo-hi-lo 




AA193646 


Hs 657 I 


Homo sapiens chromosome 19, BAC CIT4ISPC 


lo-lo-hi-lo 




AA476515


Hs.172723 


ESTs 


lo-lo-hi-lo 


455653 


BE154075 




gb:PM0-HT0339-200400-010-E05 HT0339 Homo 






H38656 


Hs.32854 


ESTs 


lo-lo-hi-lo 


457015 


AA688058


Hs.261544


ESTs 


lo-lo-hi-lo 


403654 






NM_003071:Homo sapiens SWI/SNF related,


lo-lo-hi-lo 


435203 


AW957127 


Hs.294027 


ESTs 


lo-lo-hi-lo 


409322


BE091159 


Hs.22687 


ESTs, Moderately similar to unnamed prot 


lo-lo-hi-lo 






Hs.166832 


ESTs 




432542 


AW083920 


Hs.16098 


claudin 2 




436125 




Hs.152895 


ESTs 


lo-lo-hi-lo


403217 


AL134878




ribosomal protein, large P2 


lo-lo-hi-lo


434023 


AI277883 


Hs.146141 


ESTs 


lo-lo-hi-lo 




AI749893 


Hs.270532


ESTs, Weakly similar to I38022 hypotheti




443667 


AI129066 


Hs.135457


ESTs 






AA017609 


Hs.343449 


gb:ze37e01.r1 Soares retina N2b4HR Homo 


lo-lo-hi-lo 


454775 


BE160229




gb:QV1-HT0413-090200-062-a12 HT0413 Homo


lo-lo-hi-lo


411053 


AW815061 




gb:CM0-ST0209-271099-082-d10 ST0209 Homo
voltage-gated sodium channel beta-3 subu 


lo-lo-hi-lo 


435312 


AJ243396 


Hs.4865


lo-lo-hi-lo 


450875 


AK000724 


Hs.301553 


karyopherin alpha 6 (importin alpha 7) 


lo-lo-hi-lo 


451180 


H61899 


Hs.171937


steroid dehydrogenase-like


lo-lo-hi-lo


427327 


AW501456 


Hs.288283 


Homo sapiens cDNA: FLJ22355 fis, clone H


lo-lo-hi-lo


444321 


AW204210 


Hs.122275 


Homo sapiens mRNA; cDNA DKFZp564N1623 (f 


lo-lo-hi-lo


405109 


N47812 




CGI-35 protein 


lo-lo-hi-lo


450182 


AI796400 


Hs.240767 


Human DNA sequence from clone RP1-12G14 


lo-lo-hi-lo


424990 


AU076896 


Hs.154095 


zinc finger protein 143 (clone pHZ-1) 


lo-lo-hi-lo




AF065391 


Hs.194713


zinc finger protein 265 


lo-lo-hi-lo 








NM_021186*:Homo sapiens zona pellucida g


lo-lo-hi-lo




AI524039 


Hs.192524 


ESTs 


lo-lo-hi-lo 






Hs.184361 


ESTs, Moderately similar to ALU7 HUMAN A 


lo-lo-hi-lo 


434350 


AL042940 




KIAA1682 protein 


lo-lo-hi-lo 










lo-lo-hi-lo 


442884 


AI076570 




ESTs 




400481 






Target Exon 






T51008 




gb:yb55e08.s1 Stratagene ovary (937217) 




408859 


AW291672 


Hs.258981 


ESTs 


lo-lo-hi-lo 




BE045344 


Hs.274923


ESTs, Moderately similar to unnamed prot 




427315 


AA179949 




Homo sapiens mRNA; cDNA DKFZp564N0763 (f 


lo-lo-hi-lo 


449375 


R07114 


Hs.271224


ESTs


lo-lo-hi-lo


419G37 


AB040959 


Hs.93836 


DKFZP434N014 protein




422231 


AA443512 


Hs.101383 


ESTs 


lo-lo-hi-lo 


437210 


AA311443 


Hs.293563 


Homo sapiens mRNA; cDNA DKFZp586E2317 (f 


lo-lo-hi-lo 




AA524886 




gb:nh34f02.s1 NCI_CGAP_Pr3 Homo sapiens




446586 




Hs.268820 


ESTs 


lo-lo-hi-lo 


407949 


W21874 


Hs.247057


ESTs, Weakly similar to 2109260A B cell 


lo-lo-hi-lo 


440296 




Hs.180610


splicing factor proline/glutamine rich ( 








Hs.105484 


regenerating gene type IV 


lo-lo-hi-lo 




AA642445 


Hs.287467 


Homo sapiens cDNA FLJ11948 fis, clone HE


lo-lo-hi-lo


412657 


AW976165 




gb:EST388274 MAGE resequences MAGN Homo


lo-lo-hi-lo 


405188 






Target Exon 


lo-lo-hi-lo 


416954 


AI222358 




gb:qh04c12.x1 Soares_NFL_T_GBC_S1 Homos 


lo-lo-hi-lo 


423700 


AA232375 


Hs.58606 


SNRPN upstream reading frame 


lo-lo-hi-lo 


430288 


BE394943 


Hs.13804 


hypothetical protein dJ462023.2


lo-lo-hi-lo 


435184 


T67162 


Hs.135127


ESTs, Weakly similar to unnamed protein


lo-lo-hi-lo


431475 


AI567669 


Hs.40342 


putative nuclear protein 


lo-lo-hi-lo 


445239 


AI217375 


Hs.170023 


ZH Weakly lil i 3A36 HUMAN COLI * 


lo-lo-hi-lo 


436151 


AK000801 


Hs.324271 


Homo sapiens cDNA FLJ20794 fis, clone CO 


lo-lo-hi-lo 


448489 


AI523875 




gb:tg97d04.x1 NCI_CGAP_CLL1 Homo sapiens 


lo-lo-hi-lo 


424470 


BE244261 


Hs.323502 


Homo sapiens cDNA: FLJ23539 fis, clone L


lo-lo-hi-lo 




AI334367 






lo-lo-hi-lo 




AW517236 


Hs.335752 


ESTs 


lo lo hi- c 


414034 


U89277 


Hs.305985 


early development regulator 1 (homolog o 


lo-lo-hi-lo


420382 


AW959165 


Hs.270034 


Homo sapiens, Similar to nuclear localiz 


lo-lo-hi-lo 


430433 


AA478883 


Hs.273766 


ESTs 


lo-lo-hi-lo


435351 


T80177 


Hs.118064


similar to rat nuclear ubiquitous casein




403218 


AL134878




ribosomal protein, large P2 


lo-lo-hi-lo


420678 


AW593288 


Hs.3530 


TLS-associated serine-arginine protein 2 




445808 


AV655234 




ESTs, Moderately similar to PC4259 ferri 


lo-lo-hi-lo 


429933 


AA765596 


Hs.187691


ESTs 


lo-lo-hi-lo 


419802 


AA250950 


Hs.154334


ESTs 


lo-lo-hi-lo 


425155 


W26522 


Hs.75890 


gb:32g2 Human retina cDNA randomly prime 


lo-lo-hi-lo 




N68168 




gb:za11c01.s1 Soares fetal liver spleen


lo-lo-hi-lo 


428290 


AI932995 


Hs.183475


Homo sapiens clone 25061 mRNA sequence 


lo-lo-hi-lo 


422128 


AW881145 




gb:QV0-OT0033-010400-182-a07 OT0033 Homo


lo-lo-hi-lo 


432014 


H66741 


Hs.38540 


ESTs, Weakly similar to ALU4_HUMAN ALU S


lo-lo-hi-lo
407351
443231 


AW383165 
W87548 


Hs.132932


gb:PM3-HT0344-151299-004-f07 HT0344 Homo
ESTs


lo-lo-hi-lo
lo-lo-hi-lo


444001 


AI095087 


Hs.152299


ESTs, Moderately similar to S65657 alpha 


lo-lo-hi-lo 


435064 


T70740 


Hs.31453 


ESTs 


lo-lo-hi-lo 


435173 


AW295645 


Hs.255451 


ESTs 


lo-lo-hi-lo 


411831 


AW994394 




gb:RC3-BN0036-060400-014-h12 BN0036 Homo


lo-lo-hi-lo 


446572 


AV659151 


Hs.282961 


ESTs 


lo-lo-hi-lo 


428114 


AI821548


Hs.98363 


ESTs, Weakly similar to I38022 hypotheti 


lo-lo-hi-lo 


406207 






Target Exon 
Target Exon 


lo-lo-hi-lo 
lo-lo-hi-lo 


409451 


AF012626 


Hs.54472 


fragile X mental retardation 2 




411233 
455729 


AW833793 
BE072092 




gb:QV4-TT0008-130100-080-a06 TT0008 Homo
gb:PM4-BT0532-160200-003-b11 BT0532 Homo


lo-lo-hi-lo 


439454 


AA836120 


Hs.258958 


ESTs 


lo-lo-hi-lo 


445124 


AI806403 


Hs.143942 


ESTs 




410324 


AW292539 


Hs.30177 


ESTs 


lo-lo-hi-lo 


446548 


AI769392 


Hs.200215 


ESTs 


lo-lo-hi-lo 


416999 


AW195747


Hs.21122


hypothetical protein FLJ11830 similar to 




414553 
444647 


AI813865 
H14718 


Hs.164478 
Hs.11506 


hypothetical protein FLJ21939 similar to 
Human clone 23589 mRNA sequence 


lo-lo-hi-lo 
lo-lo-hi-lo 


418271 
407939 


NM_000919
W05608 


Hs.83920 
Hs.312679 


peptidylglycine alpha-amidating monooxyg
ESTs, Weakly similar to A49019 dyneir ne 


lo-lo-hi-lo 


432676 


AI187366




gb:qf29c01.x1 Soares_testis_NHT Homo sap


lo-lo-hi-lo 


415156 


X84908 


Hs.78060 


phosphorylase kinase, beta


lo-lo-hi-lo 


432679 


AI146956 


Hs.146723


ESTs, Weakly similar to A53950 transcrip


lo-lo-hi-lo 


412121 


AB033061 


Hs.73287 


KIAA1235 protein


lo-lo-hi-lo 


418858 


AW951605 


Hs.21145 


hypothetical protein RG083M05.2 


lo-lo-hi-lo 


425204 


NM_002436


Hs.1861


membrane protein, palmitoylated 1 (55kD) 


lo-lo-hi-lo 


418348 


AI537167 


Hs.96322 


hypothetical protein FLJ23560 


lo-lo-hi-lo


410765 


AI694972 


Hs.66180 


nucleosome assembly protein 1-like 2 


lo-lo-hi-lo


445594 
416503 


AW058463 
H98502 


Hs.12940
Hs.269853 


zinc-fingers and homeoboxes 1 
ESTs 


lo-lo-hi-lo
lo-lo-hi-lo


426167 


AF039023 


Hs.167496 


RAN binding protein 6 


lo-lo-hi-lo 


451752 


AB032997 


Hs.26966 


KIAA1171 protein 


lo-lo-hi-lo 


447124 


AW976438 


Hs.17428


RBP1-like protein 


lo-lo-hi-lo 


419872 


AI422951 


Hs.146162


ESTs 


lo-lo-hi-lo 


443161 


AI038316 




gb:ox48c08.x1 Soares_total_fetus_Nb2HF8_


lo-lo-hi-lo 


445391 


T92576 


Hs.191168 


ESTs 


lo-lo-hi-lo


443801 


AW206942 


Hs.253594 


intron of: trichorhinophalangeal syndro


lo-lo-hi-lo


446706 


AW807631 


Hs.190488 


Homo sapiens, Similar to nuclear localiz


lo-lo-hi-lo


428172 


U09367 


Hs.182828


zinc finger protein 136 (clone pHZ-20) 




421021 


AA808018 


Hs.109302


ESTs


lo-lo-hi-lo


431749 


AL049263 




Homo sapiens mRNA; cDNA DKFZp564F133 (fr


lo-lo-hi-lo 


423784 


AK000039 


Hs.132826


Homo sapiens cDNA FLJ14913 fis, clone PL


lo-lo-hi-lo 


419479 


AI288348 


Hs.23450 


mitochondrial ribosomal protein S25 




450900 


H61C05 


Hs.37902 


ESTs 


lo-lo-hi-lo 


423396 


AI382555 


Hs.127950




lo-lo-hi-lo


426137 


AL040683 


Hs.167031


^Fff5MD13™prote!n 1 
ESTs 




442012 


AI733277 


Hs.128321


lo-lo-hi-lo


452271 


AA025976 


Hs.34569 


ESTs 




414882 


D79994 


Hs.77546 


Homo sapiens cDNA: FLJ21983 fis, clone H




432195 


AJ243669 


Hs.8127 


KIAA0144 gene product


lo-lo-hi-lo


430217 


N47863 


Hs.180450


ribosomal protein S24


lo-lo-hi-lo


429567 


R35606 


Hs.326800 


Human EST clone 53125 mariner transposon


lo-lo-hi-lo


438810 


AW897846 


Hs.5421 


hypothetical protein DKFZp761N09121 


lo-lo-hi-lo




BE515260 


Hs.5320 


hypothetical protein 


lo-lo-hi-lo


426352 


N72324 


Hs.55098 


ESTs 


lo-lo-hi-lo 


415308 


F05251 




gb:HSC04H101 normalized infant brain cDN 


lo-lo-hi-lo


420148 


U34227 


Hs.95361 


myosin VIIA (Usher syndrome 1B (autosoma


lo-lo-hi-lo 


434442 


AA737415 


Hs.152826


ESTs


lo-lo-hi-lo


449429 


AA054224 


Hs.59847 


ESTs 


lo-lo-hi-lo


410245 


C17908 


Hs.194125 


ESTs 


lo-lo-hi-lo


421168 


AF182277


Hs.330780 


cytochrome P450, subfamily IIB (phenobar 
ESTs 


lo-lo-hi-lo 


436237 


R11528 


Hs.271968 


lo-lo-hi-lo


440668 


AI989538 


Hs.191074 


ESTs 


lo-lo-hi-lo


422068 


AI807519 


Hs.104520


Homo sapiens cDNA FLJ13694 fis, clone PL


lo-lo-hi-lo 


410216 
439437 


BE061839 
AI207788 


Hs.343628 


gb:RC1-BT0254-290100-015-a05 BT0254 Homo
sialyltransferase 4B (beta-galactosidase 


lo-lo-hi-lo 
lo-lo-hi-lo 




AI675944 


Hs.188691


mc i 1 - II is, clcne HE 


lo-lo-hi-lo


403046 






NM_005656*:Homo sapiens transmembrane pr 


lo-lo-hi-lo


404528 


AI912555 
AC005013 


Hs.149 


peptide YY, 2 (seminalplasmin) 

cAMP response element-binding protein CR 


lo-lo-hi-lo 
lo-lo-hi-lo 


452997 


N64777 


Hs.44656 


ESTs 


lo-lo-hi-lo 


403745 




Hs.271439 


V 0 5812*;!' 11 tein (Fr 
ESTs, Weakly similar to I38022 hypotheti 


lo-lo-hi-lo 
lo-lo-hi-lo


422460 


AW450?4 


Hs.197746


ESTs 

Target Exon 


lo-lo-hi-lo 

lo-lo-hi-lo 




BE154067 


Hs.136660 


ESTs, Weakly similar to ZN91_HUMAN ZINC


lo-lo-hi-lo 


427702 
440695 


N76589 
AW088363 


Hs.14454
Hs.246240 


ESTs, Weakly similar to TF11D subunitTA 
ESTs 


lo-lo-hi-lo 


424881 


AL119690


Hs.153618 


HCGVill-1 protein 


lo-lo-hi-hi


440573 


BE550891 


Hs.270624 


ESTs 


lo-lo-hi-hi 



440184
421996 
444252 
402082 
405396 
412457 
415808 



AA830515 Hs.222917 
W67883 Hs.137476 
Hs.126919 
AB002297 Hs.7022 
AW583807 Hs.1460 
R21135 



gb:61A12 Human retina cDNA Tsp509I-cleav



' 1500122C*:gi|445955Elg [AAD21311 | (AF 
ESTs 

peptide YY, 2 (seminalplasmin)
ESTs 

Homo sapiens cDNA FLJ11086 fis, clone PL 
TATA box binding protein (TBP)-associate 
Homo sapiens pyruvate dehydrogenase kina 
Homo sapiens cDNA FLJ13903 fis, clone TH



437330 AL353944 Hs.50115 

452784 BE463857 Hs.151258 

410037 AB020725 Hs.58009 

449145 AI632122 Hs.198408 

452487 AW207659 Hs.6630 



C1 8000743*.gil6678363|ref|NP_03341 6.1 1 1 
C22000452".gi|6981522|ref|NP_036781 ,1| r 
paired basic amino acid cleaving system 
Homo sapiens, clone IMAGE:3929520, mRNA
ESTs 

Homo sapiens mRNA; cDNA DKFZp761J1112 (f
hypothetical protein FLJ21062 
KIAA0918 protein



AA830335 Hs.105273 ESTs
Hs.92423 



Homo sapiens cDNA FLJ13329 fis, clone OV



427209 I 

434280 BE005398

418236 AW994005 Hs.337534 

429201 X03178 Hs.198246 

416653 AA768553 Hs.193145 

422501 AA354690 Hs.144967 

425087 R62424 Hs.126059 

426798 AA385062 Hs.130260

443798 R07848 Hs.188522 

427254 AL121523 Hs.97774 

431657 AI345227 Hs.105443 

40SS63 M133690 Hs.250857 

446005 NM_004403 Hs.13530 

418259 AA215404 

410173 AA706017 Hs.119944 

436023 T81819 Hs.302251

448428 AF282874 Hs.21201

430665 BE350122 Hs.157367 

432559 AW452948 Hs.257631 

451572 AA018556 Hs.268691 

456032 AW957446 Hs.301711 

438209 AL120659 Hs.6111

438337 AK002058 Hs.6166 

431795 AK002088 Hs.270124 

421114 AW975051 Hs.293156 

431843 AA516420 

440948 AW188311 Hs.128619 

430105 X70297 Hs.2540

439046 AA947354 



ESTs, Weakly similar to B34087 hypotheti



deafness, autosomal dominant 5 



ESTs, Moderately similar to ALU2_HUMAN A
ESTs 

aryl-hydrocarbon receptor nuclear transl 

hypothetical protein FLJ11196

Homo sapiens cDNA FLJ11226 fis, clone PL

ESTs, Weakly similar to I78885 serine/th 

ESTs, Weakly similar to I38022 hypotheti 

ESTs 

cholinergic receptor, nicotinic, alphap 
gb:od86e11.s1 NCI_CGAP_Ov2 Homo sapiens
Homo sapiens cDNA FLJ13741 fis, clone PL 



AW081625 Hs.242561 ESTs
AI924228 Hs.115185 ESTs, Moderately similar to PC4259 ferri



AW970386 Hs.269423 ESTs 

AA678267 Hs.117115 ESTs 

BE207568 Hs.208219 oculospj

AI791949 Hs.112432 anti-Mull

AW854339 Hs.33476

AW997484 Hs.5003

N20617 Hs.194397 

AB023197 Hs.227743 

AI623817 Hs.168457 

AW963705 Hs.301183



AA709186 Hs.99070 

AL050367 Hs.66762 

NM_002742 Hs.2891

AA324057 Hs.77955 
AL133561

AI298501 Hs.12807

AK000401 Hs.252748

A/476139 Hs.13291 

AA033832 Hs.212433 

AA430348 Hs.317596 



Homo sapiens mRNA; cDNA DKFZp564A026 [fr 
protein kinase C, mu 

Homo sapiens cDNA: FLJ23527 fis, clone L 
DKFZP434B061 protein 
ESTs, Weakly similar to T46428 hypotheti 
Homo sapiens cDNA FLJ20394 fis, clone KA 



Homo sapiens cDNA FLJ12927 fis, clone NT



lo-lo-hi-hi
lo-lo-hi-hi
lo-lo-hi-hi

lo-lo-hi-hi 
lo-lo-hi-hi 
lo-lo-hi-hi



hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo



hi-hi-lo-lo 
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
hi-hi-lo-lo
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414483 


R25513 


Hs.10683 


ESTs 


hi-hi-lo-lo 


451273 


NM_014811


Hs.26163 


KIAA0649 gene product


hi-hi-lo-lo 


437052 


AA861697 


Hs.120591 


ESTs 


hi-hi-lo-lo


440049 


R06699 




hypothetical protein MGC4174 


hi-hi-lo-lo 


429483 


AA974832 


Hs.128708 


ESTs 


hi-hi-lo-lo


411296 


BE207307 




growth suppressor 1 


hi-hi-lo-lo


425188 


AK002052 


Hs.155071 


hypothetical protein FLJ11190


hi-hi-lo-lo


436315 


BE390513 


Hs.27935 


hypothetical protein MGC4837






AI127076


Hs.306201 


hypothetical protein DKFZp56401278 




431089 


BE041395 




ESTs, Weakly similar to unknown protein


hi-hi-lo-lo


418824 


AW751661 


Hs.53542 


choreoacanthocytosis gene; KIAA0986 prot




449226 


AB002365 




KIAA0367 protein 


hi-hi-lo-lo


450149 


AW969781 


Hs.132863


Zic family member 2 (odd-paired Drosophi 




418443 


NM_005239


Hs.85146 


v-ets avian erythroblastosis virus E26 o 




458692 


BE549905 


231 


ESTs 




410102 


AW248508 


Hs.279727 


ESTs; homologue of PEM-3 [Ciona savignyi




451062 


AL110125 


Hs.25910 


Homo sapiens mRNA; cDNA DKFZp554C1416 (f 


hi-hi-lo-lo


407633 


NM_007069 


Hs.37189 


similar to rat HREV107


hi-hi-lo-lo 


418941 


AA452970 


Hs.239527 


E1B-55kDa-associated protein 5 




407059 


X95406 




gb:H.sapiens cyclin E gene.


hi-hi-lo-lo 


455956 


BE162704 




gb:PM1-HT0454-3C1299-001-dC8 HT0454 Homo 


hi-hi-lo-lo


437763 


AA469369 


Hs.5831 


tissue inhibitor of metalloproteinase 1 




451404 


AA460775 


Hs.6295 


ESTs, Weakly similar to T17248 hypotheti


hi-hi-lo-lo 


428494 


AA233439 


Hs.184634 


hypothetical protein FLJ20005 


hi-hi-lo-lo


414957 


D61283 


Hs.45206 


ESTs 


hi-hi-lo-lo 


456415 


AI734051 


Hs.277102 


ESTs, Weakly similar to ALU1_HUMAN ALU S


hi-hi-lo-lo 


400183 






Eos Control 


hi-hi-lo-lo 


400158 






ENSP00000244302*:CDNAFLJ115Q1 lis An 


hi-hi-lo-lo 


403893 






ENSP00000237068*:Protocadherin alpha 6 p 


hi-hi-lo-lo


423809 


AI223833 


Hs.154483


ESTs 


hi-hi-lo-lo


400170 






Eos Control 


hi-hi-lo-lo 


403291 








hi-hi-lo-lo 


422026 


U80736 


Hs.110826


trinucleotide repeat containing 9 


hi-hi-lo-lo 


417130 


AW276858 


Hs.81256 


S100 calcium-binding protein A4 (calcium 


hi-hi-lo-lo 


432472 


AA548781 


Hs.136418


ESTs 


hi-hi-lo-lo 


405231 






C2001066:gi|10257425|ref|NP_033892.1| CD




400141 






Eos Control 


hi-hi-lo-lo 


428971 


BE278404 


Hs.285813 


hypothetical protein FLJ11807 


hi-hi-lo-lo 


422390 


AW450893 


Hs.121830 


ESTs, Weakly similar to T42682 hypotheti 


hi-hi-lo-lo 




BE270918 


Hs.164026


Homo sapiens, clone IMAGE:3534375, mRNA,




456972 


AI054347 


Hs.2017 


ribosomal protein L38


hi-hi-lo-lo 




AF205849 




Kruppel-like factor 2 (lung) 






AI568453 




ESTs, Weakly similar to CNIH_HUMAN Rl


hi-hi-lo-lo 


448439 


BE613082 


Hs.28229 


ARG99 protein 


hi-hi-lo-lo 


445418 




Hs.127179


cryptic gene 


hi-hi-lo-lo


402559 


Z23024 




Rho GTPase activating protein 1 




402575 


Z23024 




Rho GTPase activating protein 1 






AA807544 




F i i I B3432 , hi 


hi-hi-lo-lo 


446627 


AI973016 


Hs.15725


hypothetical protein SBBI48 


hi-hi-lo-lo 








Eos Control 




430289 


AK001952 


Hs.238039 


hypothetical protein FLJ11090


hi-hi-lo-lo 


400133 






Eos Control 


hi-hi-lo-lo


418816 


T29621 


Hs.88778 


carbonyl reductase 1 




433579 


BE264473 


Hs.284297 


hypothetical protein from EUROIMAGE 1967 


hi-hi-lo-lo 


401952 






Target Exon 


hi-hi-lo-lo 


410349 


AW663021 


Hs.323445 


ESTs, Weakly similar to T2D3_HUMAN TRANS 


hi-hi-lo-lo 


417558 


AF045229 


Hs.82280 


regulator of G-protein signalling 10


hi-hi-lo-lo


446851 


AW007332 


Hs.10450


Homo sapiens cDNA: FLJ22063 fis, clone H


hi-hi-lo-lo 


404489 






Target Exon 


hi-hi-lo-lo


405802 






Target Exon 


hi-hi-lo-lo 


456266 


L29073 


Hs.198726 


cold shock domain protein A 


hi-hi-lo-lo 


457133 


M54968 




v-Ki-ras2 Kirsten rat sarcoma 2 viral on


hi-hi-lo-lo 


459330 


C16931 




gb:C16931 Clontech human aorta polyA mRN 


hi-hi-lo-lo 


433041 


BE265848 


Hs.289080 


colon cancer-associated protein Mic1


lo-lo-lo-hi


446545 


AI431798 


Hs.164192


ESTs, Weakly similar to Y161_HUMAN HYPOT


lo-lo-lo-hi 


414911 


NM_000107


Hs.77502 


damage-specific DNA binding protein 2 (4


lo-lo-lo-hi 


414682 


AL021154 


Hs.76884 


inhibitor of DNA binding 3, dominant neg 


lo-lo-lo-hi 


422311 


AF073515 


Hs.114948


cytokine receptor-like factor 1 


lo-lo-lo-hi 


447329 


BE090517 




ESTs, Moderately similar to ALU8_HUMAN A


lo-lo-lo-hi 


412942 


AL120344


Hs.75074 


mitogen-activated protein kinase-activat


lo-lo-lo-hi 


420747 


BE294407 


Hs.99910 


phosphofructokinase, platelet


lo-lo-lo-hi 


431912 


AI660552 


Hs.76549 


ESTs, Weakly similar to A56154 Abl subst


lo-lo-lo-hi 


446506 


AI123118 


Hs.15159


chemokine-like factor, alternatively spl 


lo-lo-lo-hi 


408633 


AW963372 


Hs.46677 


PRO2000 protein 


lo-lo-lo-hi 


433675 


AW977553 


Hs.75319 


ribonucleotide reductase M2 polypeptide 


hi-lo-lo-hi 


424560 


AA158727 


Hs.150555


protein predicted by clone 23733 


hi-lo-lo-hi


425234 


AW152225 


Hs.165909 


ESTs, Weakly similar to I38022 hypotheti 




439815 


AA206079 


Hs.6693 


hypothetical protein FLJ20420


hi-lo-lo-hi


410174 


AA306007 


Hs.59461 


DKFZP434C245 protein 


hi-lo-lo-hi 


410442 


X73424 


Hs.63788 


propionyl Coenzyme A carboxylase, beta p 


hi-lo-lo-hi


429190 


H18650 


Hs.92502 


ESTs 




423619 


T48691 


Hs.249159 


adrenergic, alpha-2A-, receptor


hi-lo-lo-hi 
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434203 
438461 
409142 
439574 
438182 
449103 
421059 
446939 
408576 
410073 
450912 
434701 



451831 
406776 
428157 



AW753676 Hs.39982 

R74441 Hs.117176 

AF151879 Hs.26706 

BE245374 Hs.27842 

AI963747 Hs.18573 

AW163267 Hs.106469

NM_014641 Hs.277585

NM_003816 Hs.2442

BE262677 Hs.283558 
AW075485 
AL136877 
AI469788 

AW342140 Hs.182546 

T24968 Hs.23038 

AI654133 Hs.30212 

AL133353 Hs.15606

NM_003542 Hs.45423 

AW408163 Hs.58488 

AW939251 Hs.25647 

AA460479 Hs.321707 

AL117424 Hs.25035

AW956103 Hs.51712 

AI432163 Hs.268231 

NM_001674 Hs.460

T15206 Hs.237164 

AI738719 Hs.198427

BE250162 Hs.83765 

X54942 Hs.83758 

H73444 Hs.394 

AA016188 Hs.111244 
.41270 

AW958613 Hs.79428 

AW582256 Hs.91011 




426108 
441181 
447397 
427505 



442799 
443881 
416209 
421834 



437679 
446636 
422094 



424840 
418216 
412140 



ESTs 

poly(A)-binding protein, nuclear 1 
CGI-121 protein 
hypothetical protein FLJ11210 
acylphosphatase 1, erythrocyte (common) 
suppressor of var1 (S.cerevisiae) 3-like
KIAA0170 gene product
a disintegrin and metalloproteinase doma



CGI-32 protein 

H4 histone family, member G 

catenin (cadherin-associated protein), a 



Homo sapiens cDNA: FLJ23111 fis, clone L
"1 Hi nr^mlaTtoLrjHHJUIulANL-LAC 



X76732 Hs.3164 

AI984625 Hs.9884 

AA622037 Hs.166468

AA416925 Hs.121076 

BE247676 Hs.18442

AA361562 Hs.178761 

AW182459 Hs.125759

AA866115 Hs.127797 

M81933 Hs.1634

AK002011 Hs.37558 

BE258532 Hs.251871 

U75679 Hs.75257 

AI564739 Hs.68505 

R64512 Hs.237146 

AA236776 Hs.79078 

BE543205 Hs.288771 

BE297802 Hs.69360 

AL11S964 Hs.75516

AF151076 Hs.25199 

BE264974 Hs.6533 

AF062649 Hs.252587 

AF098158 Hs.9329 

C18825 Hs.29191 

Z17805 Hs.93554 

T55979 Hs.115474

BE267931 Hs.78996 

AW975531 Hs.154443 

U46258 Hs.339665 

NM_014214 Hs.5753

AC002563 Hs.15757 

AF129535 Hs.272027 

BE276112 Hs.7155 

H83363 Hs.6820 

NM_001809 Hs.1594

AI393122 Hs.134726 

AW411307 Hs.114311 

D79987 Hs.153479 

AA662240 Hs.283099 

AA219691 Hs.73625 



hypothetical protein 

procollagen-lysine, 2-oxoglutarate 5-dio 
BCL2/adenovirus E1B 19kD-interacting pro 
anterior gradient 2 (Xenopus laevis) hom
hypothetical protein PRO2013
ferritin, light polypeptide
KIAA0264 protein
pre-B-cell colony-enhancing factor
nucleolar protein p40; homolog of yeast
thioredoxin-like
activating transcription factor 7
Target CAT 
hypothetical protein 
hypothetical protein MGC4399 
nucleobindin 2 
spindle pole body protein 
programmed cell death 5 
peptidylprolyl isomerase (cyclophilin)-l 
E-1 enzyme 



Weakly similar to LEU5_HUMAN LEUKE
Homo sapiens cDNA FLJ11381 fis, clone HE
cell division cycle 25A
hypothetical protein FLJ11149
CTP synthase 

stem-loop (histone) binding protein 
ESTs 

hypothetical protein FLJ12752 
MAD2 (mitotic arrest deficient, yeast, h 
DKFZP586A0522 protein 
kinesin-like 6 (mitotic cen



hypothetical protein 
thyroid hormone receptor interactor 13
pituitary tumor-transforming 1 
chromosome 20 open reading frame 1 
epithelial membrane protein 2 
Homer, neuronal immediate early gene, 2 
replication factor C (activator 1) 3 (36
proliferating cell nuclear antigen 

nam c-r.cr.co cc:ci:--| (S 



inositol(myo)-1(or 4)-monophosphatase 2

citron (rho-interacting, serine/threonin

F-box only protein 5 

zinc finger protein 259 

translocase of inner mitochondrial membr

centromere protein A (17kD) 

ESTs 

CDC45 (cell division cycle 45, S.cerevis



hi-lo-lo-hi 
hi-lo-lo-hi 
hi-lo-lo-hi 



hi-lo-lo-hi 
hi-lo-lo-hi
hi-lo-lo-hi



hi-lo-lo-hi
hi-lo-lo-hi 
hi-lo-lo-hi 
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20 



418322 


AA284166 


Hs.84113 


cyclin-dependent kinase inhibitor 3 (CDK


hi-lo-lo-hi


428479 


Y00272 


Hs.334562 


cell division cycle 2, G1 to S and G2 to


hi-lo-lo-hi


449722 


BE280074 


Hs.23960 


cyclin B1 


hi-lo-lo-hi


417933 


X02308 


Hs.82962 


thymidylate synthetase 


hi-lo-lo-hi


433001 


AF217513 


Hs.279905 


clone HQ0310 PRO0310p1


hi-lo-lo-hi


413943 


AW294416 


Hs.144687 


Homo sapiens cDNA FLJ12981 fis, clone NT 


hi-lo-lo-hi


424905 


NM_002497


Hs.153704 


NIMA (never in mitosis gene a)-related k


hi-lo-lo-hi


422765 


AW409701 


Hs.1578 


baculoviral IAP repeat-containing 5 (sur 


hi-lo-lo-hi


425397 


J04088 


Hs.156346 


topoisomerase (DNA) II alpha (170kD) 


hi-lo-lo-hi




BE540274 






422956 


BE545072 


Hs.122579


ECT2 protein (Epithelial cell transformi




444783 


AK001468 


Hs.62180 


anillin (Drosophila Scraps homolog), act


hi-lo-lo-hi


453884 


AA355925 


Hs.36232 


KIAA0186 gene product


hi-lo-lo-hi




AA381133 


Hs.80684 


high-mobility group (nonhistone chromoso


hi-lo-lo-hi


442432 


BE093589 


Hs.38178 


hypothetical protein FLJ23468 


hi-lo-lo-hi


417308 


H60720 


Hs.81892 


KIAA0101 gene product 


hi-lo-lo-hi


433133 






PDZ-binding kinase; T-cell originated pr 


hi-lo-lo-hi


432626 




Hs.278544 


acetyl-Coenzyme A acetyltransferase 2 (a


hi-lo-lo-hi


441020 


W79283 


Hs.35962 


ESTs 


hi-lo-lo-hi


412281 


AI810054 




ESTs 




435602 


AF217515 


Hs.283532 


uncharacterized bone marrow protein BM03 




400882 






Target Exon 


hi-lo-lo-hi


446269 


AW263155 


Hs.14559 


hypothetical protein FLJ10540 


hi-lo-lo-hi




AI521558 


Hs.7331 


hypothetical protein FLJ22316 


hi-lo-lo-hi


400881 






NM_025080:Homo sapiens hypothetical prot


hi-lo-lo-hi


419356 


AI656166 


Hs.7331 


hypothetical protein FLJ22316 


hi-lo-lo-hi 


400292 


AA250737 


Hs.72472 


BMP-R1B 


hi-lo-lo-hi


415539 


AI733881 


Hs.72472 


BMP-R1B 


hi-lo-lo-hi


453935 


AI633770 


Hs.42572 


ESTs 


hi-lo-lo-hi 


420005 


AW271106 


Hs.133294


ESTs 


hi-lo-lo-hi


428450 


NM_014791 


Hs.184339


KIAA0175 gene product 




436291 


BE568452 


Hs.344037 


protein regulator of cytokinesis 1 


hi-lo-lo-hi 


441362 


BE614410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re


hi-lo-lo-hi 


428484 


AF104032 


Hs.184601


solute carrier family 7 (cationic amino 


hi-lo-lo-hi


418526 


BE019020 


Hs.85838 


solute carrier family 16 (monocarboxylic


hi-lo-lo-hi 


458809 


AW972512 


Hs.20985 


sin3-associated polypeptide, 30kD


hi-lo-lo-hi 


444984 


H15474 


Hs.132898


fatty acid desaturase 1 


hi-lo-lo-hi


447342 


AI199268 


Hs.19322


Homo sapiens, Similar to RIKEN cDNA 2010


hi-hi-lo-lo


428330 


L22524 


Hs.2256 


matrix metalloproteinase 7 (matrilysin, 


hi-hi-lo-lo 


428336 


AA503115 


Hs.183752


microseminoprotein, beta- 


hi-hi-lo-lo 


430389 


AL117429


Hs.240845 


DKFZP434D146 protein




417318 


AW953937 


Hs.240845 


ESTs 


hi-hi-lo-lo


.122543 


X02761 




fibronectin 1 


hi-hi-lo-lo 


417640 


D30857 


Hs.82353


protein C receptor, endothelial (EPCR) 


hi-lo-lo-lo 


422809 


AK001379 


Hs.121028


hypothetical protein FLJ10549 


hi-lo-lo-hi


425580 


L11144 


Hs.1907


galanin 


hi-lo-lo-hi


416836 


D54745 


Hs.80247 


cholecystokinin 


hi-lo-lo-hi


434170 


AA626509 


Hs.122329


ESTs 


hi-lo-lo-hi


427958 


AA418000 


Hs.98280 


potassium intermediate/small conductance 


hi-lo-lo-hi 




AW872527 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH




450088 


AW292933 


Hs.254110 


ESTs 


hi-lo-lo-hi 


414219 


W20010 


Hs.75823 


ALU-fused gene from chromosome 1q 


hi-lo-lo-hi 


419201 


M22324 


Hs.1239


alanyl (membrane) aminopeptidase (aminop 


hi-lo-lo-hi 


426263 


AI908774 


Hs.259785 


carnitine palmitoyltransferase 1, liver 


hi-lo-lo-hi 


456236 


AF045229 


Hs.82280 


regulator of G-protein signalling 10




456607 




Hs.106070


cyclin-dependent kinase inhibitor 1C (p5




408437 




Hs.278469 




hi-lo-lo-hi 


421180 


BE410992 


Hs.258730 


heme-regulated initiation factor 2-alpha 


hi-lo-lo-hi 


413437 


BE313164 


Hs.75361 


gene from NF2/meningioma region of 22q12 


hi-lo-lo-hi


432415 


T16971 


Hs.289014 


ESTs, Weakly similar to A43932 mucin 2 p 


hi-lo-lo-hi


449230 


BE613348 


Hs.211579


melanoma cell adhesion molecule 


hi-lo-lo-hi 


417979 


AU077284 


Hs.83081 


GTP cyclohydrolase I feedback regulatory 


hi-lo-lo-hi 


421877 


AW250380 


Hs.109059


mitochondrial ribosomal protein L12


hi-lo-lo-hi


412482 


AI499930 


Hs.334885 


mitochondrial GTP binding protein 


hi-lo-lo-hi 


428423 


AU076517 


Hs.184276


solute carrier family 9 (sodium/hydrogen 


hi-lo-lo-hi 


422947 


AA306782 


Hs.122552 


G-2 and S-phase expressed 1 


hi-lo-lo-hi


441072 


AW275480 


Hs.39504 


hypothetical protein MGC4308 


hi-lo-lo-hi 


415938 


BE383507 


Hs.78921 


A kinase (PRKA) anchor protein 1 
hypothetical protein FLJ23563 


hi-lo-lo-hi


432278 


AL137506


Hs.274256 






AA393907 








431515 


NM_012152


Hs.258583 


endothelial differentiation, lysophospha 




445345 


AW003850 


Hs.12532 




hi-lo-lo-hi


458965 


AA010319 


Hs.60389 


ESTs 


hi-lo-lo-hi 


438321 


AA576635 


Hs.6153 


CGI-48 protein 


hi-lo-lo-hi 


416783 


AA206186 


Hs.79889 


monocyte to macrophage differentiation-a 


hi-lo-lo-hi 


453563 


AW608906 


Hs.181163 


hypothetical protein MGC5629 


hi-lo-lo-hi 


432393 


AW205863 


Hs.133988


hypothetical protein FKSG28 




433914 


AF108138 


Hs.112160


Homo sapiens DNA helicase homolog (PIF1)




414907 


X90725 


Hs.77597 


polo (Drosophila)-like kinase


hi-lo-lo-hi


432375 


BE536069 


Hs.2962 


S100 calcium-binding protein P 




440773 


AA352702 


Hs.37747 


Homo sapiens, Similar to RIKEN cDNA 2700 


hi-lo-lo-hi 


415994 


NM_002923


Hs.78944 


regulator of G-protein signalling 2, 24k


hi-lo-lo-hi 
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412722 


AI343300 


Hs.15091 


ESTs 


hi-lo-lo-hi 


446839 


BE091926 


Hs.16244


mitotic spindle coiled-coil related prot


hi-lo-lo-hi


428862 


NM_000346


If .Mo 


SRY (sex determining region Y)-box 9 (ca 


hi-lo-lo-hi


439108 


AW163034




synaptogyrin 3 


hi-lo-lo-hi 




AW449612 


Hs.152475 


ESTs 


hi-lo-lo-hi 


421733 




Hs.1420 


fibroblast growth factor receptor 3 (acti 


hi-lo-lo-hi


452410 






Homo sapiens mRNA; cDNA DKFZp434E2321 (f 


hi-lo-lo-hi


430132 


AA204686 


Hs.234149 


hypothetical protein FLJ20647 


hi-lo-lo-hi




AA236291 


Hs.183583


serine (or cysteine) proteinase inhibito




413142 


M81740 


Hs.75212 


ornithine decarboxylase 1 


hi-lo-lo-hi


427239 


BE270447 


Hs.174070


ubiquitin carrier protein




409738 


BE222975 


Hs.56205 


insulin induced gene 1


hi-lo-lo-hi


410748 


BE383816 


Hs 1 25 


chromosome 1 open reading frame 21 




424506 


AF220490 


Hs.149623


group III secreted phospholipase A2 




447333 


BE090580 


Hs.70704 


hypothetical protein dJ616B8.3 


hi-lo-lo-hi


414761 


AU077228 


Hs.77256 


enhancer of zeste (Drosophila) homolog 2 


hi-lo-lo-hi


419602 


AW248434


Hs.91521 


hypothetical protein 


hi-lo-lo-hi




BE612676 


Hs.303116 


stromal cell-derived factor 2-like 1 


hi-lo-lo-hi


452322 


BE566343 


Hs.28988 


glutaredoxin (thioltransferase)


hi-lo-lo-hi 


426006 


R49031 


Hs.22627 


ESTs 


hi-lo-lo-hi




AW301344 


Hs.122908
Hs.182265


DNA replication factor 
keratin 19 


hi-lo-lo-hi


406867 
407230 


AA157857 


Hs.182265


keratin 19 


hi-lo-lo-hi


446681 


AJ003624 


Hs.15896




hi-lo-lo-hi


408493 


BE206854 


Hs.46039 


phosphoglycerate mutase 2 (muscle) 


hi-lo-lo-hi


439186 




Hs.105435


GDP-mannose 4,6-dehydratase 


hi-lo-lo-hi


424544 


M88700 


Hs.150403


dopa decarboxylase (aromatic L-amino aci 


hi-lo-lo-hi


431325 


AW026751 


Hs.5794 


ESTs, Weakly similar to 2109260A B cell


hi-lo-lo-hi 


414922 


D00723 


Hs.77631 


glycine cleavage system protein H (amino 




438291 


BE514605 


Hs.289092 


Homo sapiens cDNA: FLJ22380 fis, clone H 




418574 


N28754 




M-phase phosphoprotein 9


hi-lo-lo-hi 

hi-lo-lo-hi 


409342 


AU077058 


Hs.54089 


BRCA1 associated RING domain 1


'•32731 


AA837396 


Hs.263925 


LIS1-interacting protein NUDE1, rat homo


hi-lo-lo-hi 


436087 


BE300296 


Hs.5054 


CGI-133 protein 


hi-lo-lo-hi 


420309 


AW043637 


Hs.21766 


ESTs, Weakly similar to ALU5_HUMAN ALU S


hi-lo-lo-hi




AI418609 


Hs.71040 


hypothetical protein FLJ20425 


hi-lo-lo-hi 


424381 


AA285249 


Hs.146329 




hi-lo-lo-hi 


442547 


AA306997 


Hs.217484 


ESTs, Weakly similar to ALU1_HUMAN ALU S


hi-lo-lo-hi


430376 


AW292053 


Hs.12532


chromosome 1 open reading frame 21 


hi-lo-lo-hi


«4685 


AF151103 


Hs.112259


T cell receptor gamma locus 


hi-lo-lo-hi 


412330 


NM_005100 Hs.788


A kinase (PRKA) anchor protein (gravin)


hi-lo-lo-hi


452123 


AI267615 


Hs.38022 


ESTs 


hi-lo-lo-hi


424893 


AW295112 


Hs.153648


Homo sapiens cDNA FLJ13303 fis, clone OV


hi-lo-lo-hi


428057 




Hs.185798


ESTs 


hi-lo-lo-hi


431566 


AF176012 


Hs.260720 


J domain containing protein I 


hi-lo-lo-hi


439979 


AW600291 


Hs.6823 


hypothetical protein FLJ10430 


hi-lo-lo-hi


418836 


AI655499 


Hs.131712


ESTs 


hi-lo-lo-hi


433757 


AI949974 


Hs.152670


ESTs 


hi-lo-lo-hi


<I2523i3 


AW067800 


Hs.155223


stanniocalcin 2 


hi-lo-lo-hi


426215 


AW963419 


Hs.155223


stanniocalcin 2


hi-lo-lo-hi
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TABLE 2B 
Pkey: Unique Eos probeset ide 



Pkey CAT Number 
408650 107294.1 
409051 109699_1



410216 1184664. 

410451 1204118. 

410498 120611_1

411053 1230446. 

411233 1236369. 



412419 1293418. 

412492 130082_1 

412657 1318507 

413351 1363660. 

413509 1374313. 

413672 1382512. 

415308 1533673 

415516 1539185. 

416508 1597894. 

416631 1605019. 

416954 163427.1 

417314 1666649.1 

418056 171841.1 

418259 173388.1 

418574 17690.1 

419555 185884.1 

420811 196677.1 

421911 208937.1 

421974 209307.1 

422123 211994.1 

423028 224062.1 

423476 22861.1 

423895 233006.1 

424593 241234.1 

426074 246486.1 

425291 249618 1 

425980 258778.1 

426413 266650.1 

428181 287953.1 

429163 300543.1 

429540 305828.1 

430068 312849.1 

430103 313089 1 

430439 31808.1 

431843 338324_1

432079 341114.1 

432340 345248.1 

432676 352582_1



435023 398093.1 

436716 425440.1 

436862 42814.2 

437576 43892.1 

438869 46651.1 

438882 466649.1 

438980 467544.1 

439046 468133.1 

439848 477806.1 

440151 487109.1 

440507 495677.1 



AA525775 AA056342 AI538978 AW975281 AA664986 

AA080912 AA075313 AA083403 AA076594 AA078992 AA084926 AA081381 AA113913 AA113892 AA083821 AA134801 AA082953 AA070343

AA062835 AA075419 AA063293 AA071252 AA078900 AA062636 AW974305

AA063403 AA070823 AA070050

BE061839 AW859863 AW606085

BE065687 BE065637 AW749002 H73690 

AA355749 M085520 AW966333 AA34031 9 BE1 70S36 

AW815061 H71965AW i I A -\ 4o ^o15041 AW815047 BE152831 EE152490 BE149043 BE149075 BE149035 BE149067 
AW833793 AW833799 AW833346 AW833371 AW833795 AW833562 AW833667 AW833377 
AW852754 AW852897 AW852757 AW852617 BE172755 AW835444
BE181659 AW890576 AW857638

AW994394 AW865900 AW865905 AW865891 AW866014 AW865898 

AW948630 AW948626 AW948634 AW948616 AW948627 AW948615 AW948631 AW948605 AW948611 AW948610 AW948633 AW948623

AW948628 AW948604 AW948602 AW948607 

AW962604 AA368639 AA112257

AW976165 C04000

BE086815 BE086823 R81218 R69229 

BE145419 BE145433

BE156536 BE156439 BE156700 BE156449 BE156653 BE156533 BE156524 BE156670 BE156721 BE156723 
F05251 R13748 Z44028 H14747
F11411 R15237 Z43915 H20760



N68168 N69188 N90450

AA524886 AW971347 AA211537

AA215404 AI990909 BE464132 AW271459 N74332 AI262061

N28754 N28747 AI5681 46 AJ979339 ,\A322671 /iA322672 AW355043 AI990326 AA77640S AI01 6250 AA843678 AW451882 N23137N23129 

W70051 AI038748 AA831 327 AI925845 AW945895 

M244416AA244401 

AA807544 AA280648 AI243056 AI022744 M705288 AA829425 AW452095 AI92931 7 R19039 AA282024 

AL041520 AA300086 

AA30 1 270 AA301 379 AA301 366 

AW881145 AA490718 M85637 AA304575 T06067 AA331991

H90946 AA320597 AW954970 BE143680

AL03563 i 1 1794 F1 1783 H16042 T660B9 H29379 R19493 AW134660 AI299437 AL133995AA057405 N7B357 AA917450 AI002692 T09262 

T65008 H29290 AI200874 M89441 5 AI732887 AI791 768 AI733447 AA988785 N621 28 T09261 AW956936 

AA332215 AA403110 AW965299

AA343729 AA345779 AA344370

AA495930 AI470890 H97831 AA350358 BE166712 

AA354572 AW062361 AW813419 AW816041 AI744949

AA366951 AA470999 AA469425 

AA377823 AW954494 AI022688 

AA423976 AA437075 BE006469

AA884766 AW974271 AA592975 AA447312

M85776 AA454535 AA456208 H90189 

AA464964 M85405 AA947566 

AA465259 AW897142 AW897144 

AL133561 AL041090 AL117481 AL122069 AW439292 AI968826
BE041395 AA491826 AA621946 AA715980 AA666102

AA516420C14818C14815C15161 C15068D80763 D60656AW970134AA543007D81004D60184AI49B371 D603B2D60181 C15876 
AW972746 AA525323 AI150314 
AA534222 AA632632 T81234
AI187366 AA558869 AA618478

NM.002959 X98248 AA233278 AA846376 AI470aoJ \\ 05 BE3271 'W291971 AA017126 AI198417 AI365213AI168442AI337018 
AI475049 H85459 AA9S9895 AA888000 AA418326 AA41 3378 N71981 AL043634AA426361 AA418275AA232975 AL036861 BE277220 BE387505 
N9971 0 AW375004 AA418268 AL073 51 H8 C 4 A A 4A934366T92310AA405425AA421732AI656841 AW300968 

AW59341 8 T92267 BE464032 AW473548 AI359502 E E552306 AI9901 96 AW51 8351 AI239559 AW590963 AA0 1 8359 AI273737 AL042658 
AA411308AA402810H38111 AW013931 AW3RM \ i P4AI292020AI292121 AA340647 BE613672 BE409874 AA351915 

BE617026 BE019588 AW402692 AW247466 R59233 AA134761 BE254019 BE265105 D63316 BE313080 BE547713 BE536578 BE546749 
AA324185 H17386 BE253377 R87598 H29072 AA350980 BE076629 BE253957AA532613 BE252486AW804459 D30966 R87959 AA091832 
BE005398 AA628622 AA994155
R76593 AF147390 R76594

AI692552 AI393343 AI800510 AI377711 F24263 AA661876
AI433540 AA728984 AA804981 

AI821940 N67106 AI744264 AA808846 AA643417 AA643416 Z70715

BE514383AA071273AW247987 AW673286BE312102AW749824BE071985AW577383BE071945 BE072005 AW577355 BE071965 AW239231 

BE072000 BE071960AW577360AW749830AW37r.T 9731 AW999 11 EE000192 BE562219 BE266655 BE264970 

AF075009 R63109 R63068 

AA827695 AA833754 AW978946 

AW502384 A1982587 AA828822 

M947354AA829660AI687296 

AW979249 D63277M846968 

AA868167 F21558 F31418 F35624 

H06994 BE147898
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444290 59994J 

444314 600667_1

445808 S5133 1 

447329 71759_1



452654 925931.1 

454775 1234106_1

455019 1249138_1

455272 1271871_1

455619 1346387_1

455653 1348742_1

455729 1353792_1

455824 1372880_1

455956 1387163_1

456123 1534442_1

457133 29066_1



AA973905 AI299B88 AA917019 H63235 T90771 

AA974603 AI984319 AW340495

AI038316 AI344631 AI261653

AA262496 AV648929 AA305356 D61644 D78724

AI140497 AW749625 AW749626 AW749644 

AV655234 AW966332 AA340239 

BE090517 AW970792 AW264490 AW0H985 F27436 AA947336 F15843 H89333 AA563626 F17712 BE546579 AA421821 AA284852 AA477751 
AW025245 

BE244285 C18429 H42373 AI820706 AI379786 R55439 AW276142
AI472167 AI990315 R32175
AI523875 R45782 R45781 
AI554923 AI902356 
BE614081 W01988 AW500790

AL1 3361 9 AA4681 1 8 AA383064 AI476447 T09430 AI673753 AA524895 A1581 345 AI300820 AW49881 2 AA2561 62 AI559724 Al 685732 AA602400 
AA905453 AI204595 AW1 66541 AA1 57456 AA1 56269 AA383652 AA431 072 AW592707 AI43541 0 AW272464 AI215594 AA622747 R74039 
N35031 AI804128AW513621 AA868351 AI026826 AI493388 AA61 4641 W81604 AI567080 AI214351 M730140 AI125754 AI200813 AI269603 
AI565082 A1807095 AI476629 AA505909 AI368449 AI686077 AI532930 AW085038 M757863 AA730154 AI767072 AA468316 AI734130 AI734138 
AA426284 AA433997 AI741241 AW043563 AI732741 AI732734AA437369AA425820 AA664048 R74130 
BE144022 BE143969 BE14391S 
BE004783 BE004947 AI911790 

BE160229 AW819879 AW820179 AW819882 AW819876 AWB20169 BE153201 AW993736 BE15291 1 

AW850818 AW850833 AW851100

BE148152 BE148133 BE148159 BE148132 AW885107



BE154075 BE153973 BE064861 BE153852 BE153847 BE064684 BE153602 BE065075 BE154018 BE064772 BE064842 BE153557 BE153509 
BE072092 BE072106 BE072086 BE072098 BE072103 
BE143703 BE143631 BE143629 BE143702 
BE162704 BE162705 BE162732 BE162702 BE162694 
R00602 Z42921 F06132

M54968 NMJ049B5 AI808924 AL135130 AW242010 AA476848 AI740449 M17087 K03210 M35505 M35504 L00049 AI1865B5 W35273 X01669 
X02825 W23635 AI5S4S20 AI539465 AA425263 AI469981 W21091 T28976 AW977922 BE550180 AW664973 AI148939 AW1 17295 AA81 1229 
AI343010AA766141 BE219368N95249AA280396AW504574AA232870AI770018AA262943AW450230AW362890AW609417AW499941 
AA425857 AW380665 AA830647 AA282180 T27356 H85307 t AB615 ' i t A356548 AA356410 AW860656 AWB60647 AW93B103 AW860649 
AI56701 6 N70374 AW474707 AA505084 AA0821 95 AW949515 AA361728 N 33863 AA41 1 821 AA401 640 AW594461 AL120766 AI500024 
AW771891 H84567 D51551 AA330460 R1 4184 AI301 529 N 64676 AV659559 A1697660 AI004579 AA287927 AW453052 AW601 642 AA676681 
AA737010 AA872481 AA281094AA564243 BE46495B BE0492B5 AW167917 AAB4391 6 AAE25301 AI015987 N25230 AIBB94B1 AW173466 
AA937541 AI334416AI676214A1281159AA55355SM582189M255527AW160515AA670007H08199AAB08271 AA281015W47527AA649252 
AI364302 AA889246 R40473 H02312 AA64B1 1 6 AA342730 AA243624 R9935 1 R41588 R49696 AAB54442 F01713 AA213BB5 AA721296 R79B33 
1 184241 R70668 H85554 AA223758 N95349 AI374913 AI30BBB3 AA015609 AA91B54B AI453E70 AA772321 AI692775AA195733 AI474563 
AW873048 A12091 33 AI028182 AI374920 AW572807 AA406223 AA833684 T97255 H69138 AA382906 AW1 1 91 62 N31974 AI89D584 N3941 8 
AA864877 AA679469 BE350651 N41020 AIO5091 5 F00075 AA864B78 N26970 AA82BB98 AW019991 AW796631 AW993262 N4B532 BE564662 
AV654063 AI754461 AW945712 C03269 AV655314 AV659070 AV659308 AV660435 1 1701 13 C05323 R91984 1 196949 AV658936 AV658879 
H691 37 AA384411 AA412584 C02749 7/32014 R58166 C05526 BE53B017 N24354 AA2B7991 N80109 F05452 R12740 H08297 AL13B354 
AW020801 BE178443 BE17801B BE178336 3E178350 BE176107 BE17B3B5 BE178215 BE178186 BE17B447 BE178352 BE178422 BE178424 
3C17B043 BC176093 BC17B4eC3C17tC- E1784'1 3E1784 3E178 7 1 > 1259 E ri778C9 3E178094R28455 BE177844 BE178100 
AA262387 R70669 W80934 W93668 AA25671 1 BE1781 41 BE177893 BE17B449 AA16771B H69694 BE178017 BE178029 BE177999 BE177936 
AA095 1 44 N 32462 AA28 1 203 AA2F 1 1 175 05015 R341 - 3 T17366 R79640 W25258 R99450 AW368425 BE178196R26447 
C03146 C03683

U25750AI792472AA487379A1872282AA487262R22383AI865750 R21832AA593628AW571869AA377191 R78814T27193 
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on entitled "The DNAsei 



es nucleotide positions of predicted exi 



400481 8439853 
400501 9796227 
400713 8118874 



Imposition 

112433-112541 

12479-12619 

43185-43394 

28671-29795 

172644-172765,173085-173200 
91446-91603,92123-92265 



20 
25 



401093 8516137 

401178 9438616 

401192 9719502 

401209 7712287 

401405 7768126 



401444 8346725 

401512 7622346 

401563 8247910 

401600 4388746 

401750 9828651 

401757 7239630 

401839 7656637 

401849 7770425 

401952 3319121 

401966 3126781 

402082 8117478 

402101 8117697 

402106 8131652 

402163 8568936 

402185 8576002 

402240 7690131 

402249 7704953 



402469 9797107 Minus 

402532 9800951 Minus 

402559 9864273 Plus 

402575 9884830 Minus 

402602 7239666 Plus 

402758 9213869 Plus 

402786 9715046 Plus 

402807 6456148 Minus 

402810 6010110 Plus 

402964 9581599 Minus 

403046 3540153 Minus 

403055 8748904 Minus 

403217 7630969 Plus 

403218 7630969 Plus 
403291 7230870 Plus 
403328 8469086 Minus 



164932-165112 
69276-69452,69 
121456-121626 
136389-136508 
90895-90994,93070-93213 



91395-91763 

27363-2751 8,28727-28891 ,29526-29731 

82143-a2270,89284-89373,90596-90770,95822-96001,96688-96775,96B70-96992,98046-9B138 
88641-88751 

1016-1086,2751-2967,3241-3348,26677-26831 
129375-129483,129597-129720 



190046-190183 

1 34308-1 34487, 135402-1 35587,1 36421 -1 36548 

3717-3848 

166996-167119 



104382-104527,106136-106372 

107636-107813,106694-106624,110435-110502,113162-113366 

13714-15440 

4426-4648 

71266-72351 

180240-180558 

33539-33715 

109742-109883 

6785-6972,7478-7575 



47624-47795 

42-101660,103476-103656 
' "27-13643 



55707-55859,56369-56511 

109532-110225 

54089-54163,55427-55623 



403725 7534031 

403739 7630882 

403740 7630882 
403745 7652036 



404054 3548785 

404058 3548785 

404108 8247074 

404211 5006246 

404277 1834458 

404384 8887028 

404407 7329316 

404489 8113772 

404527 8152087 



134394-134812 
86737-86843 

44563-44766,48209-48483,52255-52495 
86504-87227 



38657-38817 
81889-82011 
66713-69175 
99397-101808 



1 85728-1 85885,194575-194686 
91665-91946 

38055-38156,42175-42391,43435-43553 

48154-48499 

98183-98480 

127737-127796,128080-128210,129888-130054,13 






8152087 
9797073 
9797133 
7387343 
6139150 
7596797 
8076881 



405188 I 

405231 ' 

405365 ; 

405387 i 

405396 i 

405429 ' 

405435 ' 

405446 ' 

405525 

405529 ! 

405610 ! 



117359-117612 
98903-101141 
120922-121296 
30301-30518 



119867-120372,120481- 
3769-3833,5708-5895 
89965-90273 
51577-51723 

51704-51841,53581-53767 



51198-51314 



168961-169150,169610-169769 
Table 3A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 4.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
SeqIDNo: Seq ID number correlation





































s. 






Pkey    ExAccn    UnigeneID  Unigene Title                             SeqIDNo
408660  AA525775             ESTs, Moderately similar to PC4259 ferri  Seq ID No 15 & 16
409051  AA080912             i ml' 1 Slratagene hNI urc                Seq ID No 17
409123  AA063403             gb:zm04d12.s1 Stratagene corneal stroma   Seq ID No 18
415787  H01463    Hs.93534   ESTs                                      Seq ID No 19-21
415999  AA172179  Hs.294029                                            Seq ID No 22
416225            Hs.188684  ESTs, Weakly similar to PC4259 ferritin   Seq ID No 23
420757  X78592    Hs.99915   androgen receptor (dihydrotestosterone r  Seq ID No 24 & 25
429163  AA884766             gb:am20a10.s1 Soares_NFL_T_GBCS1 Homos    Seq ID No 26
429441  AJ224172  Hs.204096  lipophilin B (uteroglobin family member)  Seq ID No 27 & 28
431099  Y13367    Hs.249235  phosphoinositide-3-kinase, class 2, alph  Seq ID No 29 & 30
432432  AA541323  Hs.115831  ESTs                                      Seq ID No 31
432435  BE218386  Hs.282070  ESTs                                      Seq ID No 32 & 33
432527  AW975028  Hs.102754  ESTs                                      Seq ID No 34
435876  AW612586  Hs.160271  G protein-coupled receptor 48             Seq ID No 35 & 36
4-8J33  W52448    Hs.56147   ESTs                                      Seq ID No 37-40
439569  AW6C2166  Hs.222399  CEGP1 protein                             Seq ID No 41 & 42
440819  AI809444  Hs.202108  ESTs                                      Seq ID No 43
442832  AW206560  Hs.253569  ESTs                                      Seq ID No 44
        AI199268  Hs.19322   Homo sapiens, Similar to RIKEN cDNA 2010  Seq ID No 45 & 46
447499  AW262580  Hs.147674  protocadherin beta 16                     Seq ID No 47 & 48
451411            Hs.135655  EST                                       Seq ID No 49
451720  AW970985  Hs.290853  ESTs                                      Seq ID No 50 & 51

Table 3B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 3A. For each probeset, the gene cluster number from which oligonucleotides were
designed is listed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using
Clustering and Alignment Tools (DoubleTwist, Oakland, California). Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.

Pkey CAT Number Accession 

408660 107294J AA525775 AA056342 AI538978 AW975281 AA664986
409051 109699J AA080912 AA075318 AA083403 AA076594 AA078992 AA084926 AAC81881 AA113913 AA113892 AA083821 AA134801 M082953 AA070343
               AA062835 AA075419 AA063293 AA071252 M078900 AA062836 AW974305
409123 110143J AA063403 AA070823 AA070050
429163 300543J AA884766 AW974271 AA5S2975 AA447312
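
The row layout described for Table 3B (a Pkey, a CAT number, then one or more Genbank accession numbers) can be read programmatically. The following is a minimal sketch, assuming whitespace-separated fields on a single line; the function name `parse_cluster_row` and the dictionary keys are illustrative and do not appear in the specification. Identifiers are kept as strings because some entries in this listing are not cleanly numeric.

```python
def parse_cluster_row(row):
    """Split one Table 3B style row into its three parts.

    Assumed layout (hypothetical helper, not part of the specification):
    <Pkey> <CAT number> <accession> [<accession> ...]
    """
    fields = row.split()
    if len(fields) < 3:
        raise ValueError("expected a Pkey, a CAT number, and at least one accession")
    pkey, cat_number, *accessions = fields
    return {"pkey": pkey, "cat_number": cat_number, "accessions": accessions}


# Example using a row as printed in Table 3B.
record = parse_cluster_row("409123 110143J AA063403 AA070823 AA070050")
print(record["pkey"], record["cat_number"], len(record["accessions"]))
```

Rows that wrap onto a continuation line would need to be joined before parsing; this sketch does not attempt that.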
on numbers in Table 3A. For each predicted exon is listed genomic sequence source



403740 7630882 Plus 86504-87227 
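
A predicted-exon row such as the one above can be split into its fields and the exon span computed. This is a sketch only: it assumes the four columns are a Pkey, a genomic sequence identifier, a strand, and a comma-separated list of nucleotide ranges, and that each range is inclusive at both ends. The function names are hypothetical. The absolute difference is used so that a range printed in descending order is still counted.

```python
def parse_exon_ranges(text):
    """Parse a string such as '172644-172765,173085-173200' into (start, end) pairs."""
    ranges = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        ranges.append((int(start), int(end)))
    return ranges


def total_exon_length(ranges):
    """Sum the lengths of the ranges, treating both endpoints as inclusive (an assumption)."""
    return sum(abs(end - start) + 1 for start, end in ranges)


# Example using the row printed above.
row = "403740 7630882 Plus 86504-87227"
pkey, ref, strand, positions = row.split(maxsplit=3)
exons = parse_exon_ranges(positions)
print(pkey, ref, strand, exons, total_exon_length(exons))
```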
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iC AGTGCGGAGA CCt 



A AGGAAGATCA TTTCATGCCT 
CATAACCATT TGGCTCTGAG CTATGACAAG 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC 
A AATGCCACCA CCATTGTCCA 



GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT 
AGAAGAT C AA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT 

T GATGGACCTA TACACCACAG GGCTTTACT 
G GTCCTTATCA TATTATTTTG TTACTTCCC 



GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 

GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG 
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 
CCCCCCTCAT TCCCAAACCC GTCCAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 
ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTT.AAGAAA 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG 
CATCTCTCCA CAAACCCAAC ACCTACTCTT CTCTTTCTCG CCAGACCAAA AGACATCAAA 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 
TCTGTTTGIA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

n sequence 



MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 

DSGLPWTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 

GPIHHRALLI SVTVCSLLLV LIILFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 

EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 

WFRETEIYQT VLMRHENILG FIAADIKGTG SWTQLYLITD YHENGSLYDY LKSTTLDAKS 

MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGICCIAD LGLAVKFISD 

TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQSYIMADM YSFGLILWEV ARRCVSGGIV 

EEYQLPYHDL VPSDPSYEDM REIVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 
RLTALRVKKT LAKMSESQDI KL 
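
The DNA entries in this listing give a coding sequence as a pair of nucleotide positions, for example 55..1575. A short sketch of how such a range can be applied to a sequence string follows. It assumes the positions are 1-based and inclusive, which is the usual convention for such annotations but is not stated in the text; the helper names are illustrative only, and the short test sequence is made up for the example.

```python
def cds_length(start, end):
    """Length in nucleotides of a coding range, assuming 1-based inclusive positions."""
    return end - start + 1


def extract_cds(sequence, start, end):
    """Return the coding portion of a sequence, assuming 1-based inclusive positions."""
    cleaned = "".join(sequence.split()).upper()  # drop the spaces between 10-base groups
    return cleaned[start - 1:end]


# A coding range should span a whole number of codons.
length = cds_length(55, 1575)
print(length, length % 3 == 0)

# Toy sequence (not from the listing) to show the slicing.
print(extract_cds("AACCATGGGT TAA", 5, 13))
```

A length that is not a multiple of three would suggest that either the range or the assumed convention is wrong for that entry.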

Seq ID NO: 3 DNA sequence 

Coding sequence: 55..1575



GAGCCGCGAC 



IGAGACTGCO 
GAGTAATTAT 

TGCGCCCTGG CATTGGAGAG AAGCCCACTG TGGTCACTGT TGAGATCGCC 
TTGGTCCTCT CTCTATCCTA GACATGGAAT ACACCATTGA C 
GGTACGACGA A 
TGGTGAGCCA G 

GGATGGCAAG 



CCAATGGATT CTCACTCTTG CCCTCTATCT TTCTCTAGCT TTTCCTATCC TGAGAA7GAG 



G ATTTTACAGG A 
GACTTCATGG TCATGACGAT T 
CAAAACTATG TCCCTTCTTC CGTGACCACG ATGCTCTCCT GGGTTTCCTT TTGGATCAAG 960
ACAGAGTCTG CTCCAGCCCG GACCTCTCTA GGGATCACCT CTGTTCTGAC CATGACCACG 1020
TTGGGCACCT TTTCTCGTAA GAATTTCCCG CGTGTCTCCT ATATCACAGC CTTGGATTTC 1080
TATATCGCCA TCTGCTTCGT CTTCTGCTTC TGCGCTCTGT TGGAGTTTGC T' 
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TTCCTGATCT ACAACCAGAC AAAAGCCCAT GCTTCTCCTA AACTCCGCCA TCCTCGTATC 1200 

AATAGCCGTG CCCATGCCCG TACCCGTGCA CGTTCCCGAG CCTGTGCCCG CCAACATCAG 1260 

GAAGCTTTTG TGTGCCAGAT TGTCACCACT GAGGGAAGTG ATGGAGAGGA GCGCCCGTCT 132 0 

TGCTCAGCCC AGCAGCCCCC TAGCCCAGGT AGCCCTGAGG GTCCCCGCAG CCTCTGCTCC 1380 

AAGCTGGCCT GCTGTGAGTG GTGCAAGCGT TTTAAGAAGT ACTTCTGCAT GGTCCCCGAT 1440 

A GTACCTGGCA GCAGGGCCGC CTCTGCATCC ATGTCTACCG CCTGGATAAC 1500 



GGTCCAAGCC CCTTGCCAAG GGAGTTGGGG GAAAGCAGCA G Ai CAG i l^'CGACTAG 1680 

AGTTTTTCCT GCCCCATTCC CCAAACAGAA GCTTGCAGAG GGTTTGTCTT TGCTGCCCCT 1740 

CTCCCCTACC TGGCCCATTC ACTGAGTCTT CTCAGCAGAC CATTTCAAAT TATTAATAAA 1800 

TGGGCCACCT CCCTCTTCTT CAAGGAGCAT CCGTGATGCT CAGTGTTCAA AACCACAGCC 1860 

ACTTAGTGAT CAGCTCCCTA AAACCATGCC TAAGTACAGG CGGATTAGCT ATCTTCCAAC 192 0 



CTTTCGGCCC AGTTCTGGCC TCAGCCTCAA AGTGCACCGA CTAGTTGCTT GCCTATACCT 



CACTGGCATT ATCCCTTTAG GAAGAGGGGG GGGCAGCAAG AGAGCCTATT TGGGACAGCA 
TTCCTCTCTC TCTGCTGCTG TGACATCTCC CTCTCCTTGC T 
ACTACCAATT CAATGCCCTT C 



GCCAAGAAAC TAAGGAAACT CGGCTTTGCA ACAGGCATTA CTCGCCATTG ATTGGTGCCC 252 0 



ACCTTCTAGA 



3 CCCCCAAGAT CAAATCTCTC CTGGCTC-TAG TAACCCAGTG 3 000 

?GCTTCT ATATGCTAAG TGAAATCTGT GTCTGTAATT 3 0 SO 

T GGGGTCTCCA TCTACTTTTT GTCACCATCA TCTGAAATGG 3120 
GGAAATATGT AAATAAATAT ATCAGCAAAG CAAAAAGAAA AAAAAAAA 



GILLILQSRV EGPQTESKME ASSRDWYGP QPQPI^ 



KLFQFDFTGV SNKTEIITTP VGDFMVMTIF FNVSRRFGYV AFQNYVFS3V TTMLSWV5FW 
IKTHSAPART SLGITSVLTM TTLGTFSRKN FPRVSYITAL DFYIAICFVF CFCALLEFAV 
LNFLIYNQTK AHASPKLRHP RINSRAHART RARSRACARQ HQEAFVCQIV TTEGSDGEER 
PSCSAQQPPS PGSPEGPRSL CSKLACCEWC KRFKKYFCMV PDCEGSTWQQ GRLCIHVYRL 



Seq ID NO: 5 DNA sequence 
Nucleic Acid Accession #: NM_021984.1
Coding sequence: 572..1753

1 11 21 31 41 51 

I I I I I I 

GCCAGAGCGT GAGCCGCGAC CTCCGCGCAG GTGGTCGCGC CGGTCTCCGC GGAAATGTTG 
TCCAAAGTTC TTCCAGTCCT CCTAGGCATC TTATTGATCC TCCAGTCGAG AACATGTATA 
CAGAGAAGTG CTCAAATCAT AAGTGTACAG CTGATGAGTT GTCAAAAAAT C-ACCACAGCG 
GTGTAAAGAA AGCCAAATCA AGGACCCGAA TGTGAGCAGG ACCTCAGAAG CCCCCTTTGT 
CACTGCCTCC CAGCAAAGGC AGCACTATCC GGACTTCTAA CACCATCGGG TCGAGGGACC 
TCAGACTGAA TCAAAGAATG AAGCCTCTTC CCGTGATGTT GTCTATGGCC CCCAGCCCCA 
GCCTCTGGAA AATCAGCTCC TCTCTGAGGA AACAAAGTCA ACTGAGACTG AGACTGGGAG 
CAGAGTTGGC AAACTGCCAG AAGCCTCTCG CATCCTGAAC ACTATCCTGA GTAATTATGA 
CCACAAACTG CGCCCTGGCA TTGGAGAGAA GCCCACTGTG GTCACTGTTG AGATCTCCGT 
CAACAGCCTT GGTCCTCTCT CTATCCTAGA CATGGAATAC ACCATTGACA TCATCTTCTC 
CCAGACCTGG TACGACGAAC GCCTCTGTTA CAACGACACC TTTGAGTCTC TTGTTCTGAA 
TGGCAATGTG GTGAGCCAGC TATGGATCCC GGACACCTTT TTTAGGAATT CTAAGAGGAC 
CCACGAGCAT GAGATCACCA TGCCCAACCA GATGGTCCGC ATCTACAAGG ATGGCAAGGT 
GTTGTACACA ATTAGGATGA CCATTGATGC CGGATGCTCA CTCCACATGC TCAGATTTCC 
AATGGA1TCT CACTCTTGCC CTCTATCTTT CTCTAGCTTT TCCTATCCTG V3AATGAGAT 
GATCTACAAQ TGGGAAAATT TCAAGCTTGA AATCAATGAG AAGAACTCCT GGAAGCTCTT 
CCAGTTGGAT TTTACAGGAG TGAGCAACAA A 
CTTCATGGTC ATGACGATTT T 
AAACTATGTC CCTTCTTCCG TGACCACGAT GCTCTCCTGG GTTTCCTTTT GGATCAAGAC 1140
AGAGTCTGCT CCAGCCCGGA CCTCTCTAGG GATCACCTCT GTTCTGACCA TGACCACGTT 1200
GGGCACCTTT TCTCGTAAGA ATTTCCCGCG TGTCTCCTAT ATCACAGCCT TGGATTTCTA 1260
TATCGCCATC TGCTTCGTCT TCTGCTTCTG CGCTCTGTTG GAGTTTGCTG TGCTCAACTT 1320
C AACCAGACAA AAGCCCATGC TTCTCCTAAA CTCCGCCATC CTCGTATCAA 1380 
2 CATGCCCGTA CCCGTGCACG TTCCCGAGCC TGTGCCCGCC AACATCAGGA 1440 
^AGATTG TCACCACTGA GGGAAGTG Q ^ -CGTCTTG 15 00 

CTCAGCCCAG CAGCCCCCTA GCCCAGGTAG CCCTGAGGG? CCCCGCAGCC TCTGCTCCAA 1560 
GCTGGCCTGC TGTGAGTGGT GCAAGCGTTT TAAGAAGTAC TTCTGCATGG TCCCCGATTG 1620 
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TGAGGGCAGT ACCTGGCAGC AGGCCCGCCT CTGCATCCAT GTCTACCGCC TGGATAACTA 1S8 0 

CTCGAGAGTT GTTTTCCCAG TGACTTTCTT CTTCTTCAAT GTGCTCTACT GGCTTGTTTG 1740 

CCTTAACTTG TAGGTACCAG CTGGTACCCT GTGGGGCAAC CTCTCCAGTT CCCCAGGAGG 1800 

TCCAAGCCCC TTGCCAAGGG AGTTGGGGGA AAGCAGCAGC AGCAGCAGGA GCGACTAGAG 1860 



TGCTGACCAC CAGACAATTA CTOOOTTTT CCAGAACCCC ACTATTCCCT T 

C CTATACCTGG 222 0 
3 TCCTTTGGTC 2280 



TACCAATTCA ATGCCCTTCA 
TTTCCCAGTG ACTTCCCCTA GCCCTGACCC 



CTGTTATACC CGGGGCACTC TAACCATCAC 

TAGCCTTGTG ACATCTTTAG 
GTCACAGATT TCTGTGGGAC 



A AGGCCCAGAA TGGCGACCTC TCTTTAGCTC AATTTCTGGG 
CCTGAGGTGC TCAGACTGCC CCCAAGATCA AATCTCTCCT GGCTGTAGTA ACCCAGTGGA 
ATGAATTTGG ACATGCCCCA ATGCTTCTAT ATGCTAAGTG AAATCTGTGT CTGTAATTTG 
TTGGGGGGTG GATAGGGTGG GGTCTCCATC TACTTTTTGT CACCATCATC TGAAATGGGG 



I I I I I I 

MEYTIDIIFS QTWYDERLCY NDTFBSLVLN GNWSQLWIP DTFFRNSK3T IIEI3EITMPNQ 

MVRIYKDGKV LYTIHMTIDA GCSLHMLiRFP MDSHSCPLSF SSFSYPENEM IYKWENFKLE 

INEKNSWKLF QLDFTGVSNK TEIITTPVGD FMVMTIFFNV SRRFGYVAFQ KYVPSSVTTX 

LSWVSFWIKT ESAPARTSLC ITSVLTMTTL CTFSRKNFPR VSYITALDFY IAICFVFCFC 

F LIYNQTKAHA SPKLRHPRIN SRAHARTRAR SRACARQHQE AFVCQIVTTE 

Z SAQQPPSPGS PEGPRSLCSK LACCEWCKRF KXYFCMVPDC EGSTWQQARL 

CIHVYRLDNY SRWFPVTFF FFNVLYWLVC LNL 

Seq ID NO: 7 DNA sequence

Nucleic Acid Accession #: NM_021987.1

Coding sequence: 572..1657

i i 1 r r i 1 r 

GCCAGAGCGT GAGCCGCGAC CTCCGCGCAG GTGGTCGCGC CGGTCTCCGC GGAAATGTTG 
Z TTCCAGTCCT CCTAGGCATC TTATTGATCC TCCAGTCGAG AACATGTATA 
AATCAT AAGTGTACAG CTGATGAGTT GTCAAAAAAT GACCACAGCG 
GTGTAAAGAA AGCCAAATCA AGGACCCGAA TGTGAGCAGG ACCTCAGAAG CCCCCTTTGI 
CACTGCCTCC CAGCAAAGGC AGCACTATCC GGACTTCTAA CACCATCGGG TCGAGGGACC 
TCAGACTGAA TCAAAGAATG AAGCCTCITC CCGTGATGTT GTCTATGGCC CCCAGCCCCA 
GCCTCTGGAA AATCAGCTCC TCICTGAGGA AACAAAGTCA ACTGAGACTG AGACTGGGAG 
CAGAGTTGGC AAACTGCCAG AAGCCTCTGG CATCCTGAAC ACTATCCTGA GTAATTATGA 
CCACAAACTG CGCCCTGGCA TTGGAGAGAA GCCCACTGTG GTCACTGTTG AGATCTCCGT 
CAACAGCCTT GGTCCTCTCT CTATCCTAGA CATGGAATAC ACCATTGACA TCATCTTCTC 
CCAGACCTGG AATTCTAAGA GGACCCACGA GCATGAGATC ACCATGCC 
CCGCATCTAC AAGGATGGCA AGGTGTTGTA CACAATTAGG AIGACCAT 
CTCACTCCAC ATGCTCAGAT TTCCAATGGA TTCTCACTCT TGCCCTCT 

T CCTGAGAATG AGATGATCTA CAAGTGGGAA AATTTCAAGC TTGAAATCAA 
C TCCTGGAAGC TCTTCCAGTI TGATTITACA GGAGTGf 
A ACCCCAGTTG GTGACTTCAT GGTCATGACG ATTTTCTTCA A 
GCGGTTTGGC TATGTTGCCT TTCAAAACTA TGTCCCTTCT TCCGTGACCA CGATGCTCTC 
CTGGGTTTCC TTTTGGATCA AGACAGAGTC TGCTCCAGCC CGGACCTCTC TAGGGATCAC 
G ACCATGACCA CGITGGGCAC CTTTTCTCGT AAGAATTTCC CGCGTGTCTC 
A GCCTTGGATT TCTATATCGC CATCTGCTTC GTCTTCIGCT TCTGCGCTCT 
GTTGGAGTTT GCTGTGCTCA ACTTCCTGAT CTACAACCAG ACAAAAGCCC ATGCTTCTCC 
TAAACTCCGC CATCCTCGTA TCAATAGCCG TGCCCATGCC CGTACCCGTG CACGTTCCCG 
AGCCTGTGCC CGCCAACATC AGGAAGCTTT TGTGTGCCAG ATTGTCACCA CTGAGGGAAG 
TGATGGAGAG GAGCGCCCGT CTTGCTCAGC CCAGCAGCCC CCTAGCCCAG G 
GGGTCCCCGC AGCCTCTGCT CCAAGCTGGC CTGCTGTGAG TGGTGC? 

C ATGGTCCCCG ATTGTGAGGG CAGTACCTGG CAGCAGGGCC G 
Z CGCCTGGATA ACTACTCGAG AGTTGTTTTC CCAGTGACTT TCTTCTTCTT 
CAATGTGCTC TACTGGCTTG TTTGCCTTAA CTTGTAGGTA CCAGCTGGTA CCCTGTGGGG 
CAACCTCTCC AGTTCCCCAG GAGGTCCAAG CCCCTTGCCA AGGGAGTTGG GGGAAAGCAG 
CAGCAGCAGC AGGAGCGACT AGAGTTTTTC CTGCCCCATT ICCAAACAG TT 
AGGGTTTGTC TTTGCTGCCC CTCTCCCCTA CCTGGCCCAT TCACTGAGTT TTCTCAGCAG 
ACCATTTCAA ATTATTAATA AATGGGCCAC CTCCCTCTTC TTCAAG 

CTCAGTGTTC AAAACCACAG CCACTTAGTG ATCAGCTCCC TAAAACCATG CCTAAGTACA 
GGCGGATTAG CTATCTTCCA ACAATGCTGA CCACCAGACA ATTACTGCAT TTTTCCAGAA 
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3 cagatagata gacactggca 
agagagccta tttgggacag cattcctctc tctctgctgc tgtgacatct 
:taccaa ttcaatgccc ttcatccaat 
a actactccct gctttatatg ccaccc 




CAATCAAATT CCCTTAAATT T 

r TGGAGCTTCA TGATAGCCTT G 

3 ATGAAAACCC TGAGTCACAG ATTTCTGTGG GACTGTGGAT CTCACTGGAA 

rCACTGT CACCTTCTAG ACCACATGAT AGGGCTAGAC AGCTCAGTTC 

C TCTTCTGTCA C 

CTCTCTTTAG CTCAATTTCT GGGCCTGAGG TGCTCAGACT GCCCCCAAGA T< 

CCTGGCTGTA GTAACCCAGT GGAATGAATT TGGACATGCC CCAAIGCTTC TATATGCTAA 3120 

GTGAAATCTG TGTCTGTAAT TTGTTGGGGG GTGGATAGGG TGGGGTCTCC ATCTACTTTT 31B0 

TGTCACCATC ATCTGAAATG GGGAAATATG TAAATAAATA TATCAGCAAA GC 



I 



I I 
S QTKNSKRTHE HEITMPNQMV RIYKDGKVLY T 

SHSCPLSFSS FSYPENEMIY KWENF KLEIN EKNSWKLFQF DFTGVSNKTE IITTPVGDFM 

VMTIFFNVSR RFGYVAFQNY VPSSVTTMLS WVSFWIKTES APARTSLGIT SVLTMTTLGT 

FSRKNFPRVS YITALDFYIA ICFVFCFCAL LEFAVLNFLI YNOTKAHASF KLRHPRINSR 

AHARTRARSR ACARQHQEAF VCQIVTTEGS DGEERPSCSA QQPPSPGSPE GPRSLCSKLA 

K YFCMVPDCEG STWQQGRLCI HVYRLDNYSR WFPVTFFFF NVLYWLVCLN 



Seq ID NO: 9 DNA sequence

Nucleic Acid Accession #: NM_021990.1

Coding sequence: 1309..2490



I 



I 



TCCAAAGTTC TTCCAGTCCT 
CTCAAATCAT 
AGCCAAATCA 
CACTGCCTCC CACCAAAGGC 
CCTTGGCAGA TGGCCTTTAA 
T1TTCTTGGC TGTGGTGCAT 
TCCTGGATGG CTGTCTGTGG 
GCTGCTCTTT AGCCTCCTTC 
AAAACCGCAA 



CTCCGCGCAG GTGGTCGCGC CGGTCTCCGC GGAAAIGTTG 
CCTAGGCATC TTATTGATCC TCCAGTCGAG AACATGTATA 
AAGTGTACAG CTGATGAGTT GTCAAAAAAT GACCACAGCG 
AGGACCCGAA TGTGAGCAGG ACCTCAGAAG CCCCCTTTGT 
AGCACTATCC GGACTTCTAA CACCATCGGT GAGTTTCATA 
CATTTTTGTT TAATTCAATT ATTCTTACTA ATCTTCTTCT 
GGCTGTGGAG CTCAGGGTGG ACTCCTGTTG GGCAGCCAGT 
jCCTTTC CTG-'ITAGA 1 



CTCTGGCTTT 
TCAGGCTGAC 
AAATGCCC1C 



CAGCAAGGTT TTAAAGAAAI 



TTCATTTCAC 



AATATTCCCA CAATTTTCIG GTCCTCTCTG GGAGAGGCCG 
CCTGGCCCTC TGCCTGCTCC TCACTCCTGG TTGGTGCTGG 
CGCGGCCAGC GCTCAGACAT 



TGATCATAAA AGAGGGACAG CATAGAAAGT 



AGGGACCTCA 



CACTGTGGTC 



ATTGACATCA TCTTCTCCCA GACCTGGTAC GACGAACGCC TCTGTTACAA CGACACCTTT 

G TTCTGAATGG CAATGTGGTG AGCCAGCTAT GGATCCC3GA CACGTTTTTT 

R CGAGCATGAG ATCACCATGC C 

I GTACACAATT AGGATGACCA TTGATGCCGG A 

^TCTCAC TCTTGCCCTC TATCTTTCTC TAGCTTTTCC 

A ATGAGATGAT CTACAAGTGG GAAAATTTCA AGCTTGAAAT CAATGAGAAG 

A AGCTCTTCCA GTTTGATTTT A 

ACAACCCCAG TTGG1GACTT CATGGTCATG A 



TCAAGACAGA GTCTGCTCCA 



:gc tctgttggag 



ACAGCCTTGG ATTTCTATAT C 
TTTGCTGTGC TCAACTTCCT G 
CGCCATCCTC GTATCAATAG CCGTGCCCAT G 
GCCCGCCAAC ATCAGGAAGC TTTTGTGTGC CAGATTGTCA C 
GAGGAGCGCC CGTCTTGCTC AGCCCAGCAG O 

CGCAGCCTCT GCTCCAAGCT GGCCTGCTGT GAGTGGTGCA AGCGTTTTAA 3AAGTACTTC 
TGCATGGTCC CCGATTGTGA GGGCAGTACC TGGCAGCAGG G 
TACCGCCTGG ATAACTACTC G 
CTCTACTGGC TTGTTTGCCT TAACTTGTAG GTACCAGCTG GTACCCTGTG 3GGCAACCTC 2520 
TCCAGTTCCC CAGGAGGTCC AAGCCCCTTG CCAAGGGAGT TGGGGGAAAG CAGCAGCAGC 2580 
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G CAGTGCTTTC GGCCCAGTTC TGGCCTCAGC CTCAAAGTGC A 
TGCTTGCCTA TACCTGGCAC CTCATTAAGA TGCTGGGCAG CAGTATAACA G 

?TATGTT CTCAGTTCTC TCTCCCTGCT ACCCCTTTCT 
3 GCATTATCCC TTTAGGAAGA GGGGGGGGCA GCAAGAGAGC 
T CTCTCTCTGC TGCTGTGACA T 



40 



TGTGATTATA GTAACTACTC CCTGCTTTAT ATGCCACCCT CTTCCTTCTC 
GTGACTCTTT CTGTAACTTT CCCAGTGACT TCCCCTAGCC CIGACCAGGC 
GGTGACTTCC TGGGGCCAAG AAACTAAGGA AACTCGGCTT TGCAACAGGC 
15 ATTGATTGGT GCCCACCCAG GGCACACTGT CGGAGTTCTA TCACTTGCTT 



'GTATGG CACTGGAACT TTGGCAAAGC ACTTTTGACA A 
GATTGGAGCT TCATGATAGC CTTGTGACAT CTTTAGGGCA GGATTC1 

CAGATGAAAA CCCTGAGTCA CAGATTTCTG TGGGACTGTG GATCTCA>_ iu ..... 

AAGAGCCCAC TGTCACCTTC TAGACCACAT GATAGGGCTA GACAGCTCAG TTCACCATGA 3780

TTCTCTTCTG TCACCTCTGC TGGCACACCA GTGGCAAGGC CCAGAATGGC GACCTCTCTT 3840

TAGCTCAATT TCTGGGCCTG AGGTGCTCAG ACTGCCCCCA AGATCAAATC TCTCCTGGCT 3900

GTAGTAACCC AGTGGAATGA ATTTGGACAT GCCCCAATGC TTCTATATGC TAAGTGAAAT 3960

CTGTGTCTGT AATTTGTTGG GGGGTGGATA GGGTGGGGTC TCCATCTACT TTTTGTCACC 4020

ATCATCTGAA ATGGGGAAAT ATGTAAATAA ATATATCAGC AAAGC

30 1 

I I I I I 

S QTWYDERLCY NDTFESLVLN GNWSQLWIP DTFFMJSKRT HEHEITMPNQ 60 

17 LYTIRMTIDA GCSLHMLRFP MDSHSCPLSF SSFSYPENEM IYKW3NFKLE 12 0 

INEKNSWKLF QFDFTGVSNK TEIITTPVGD FMVMTIFFNV SRRFCYVAFQ KYVPSSVTTM 180 

LSWVSFWIKT ESAPARTSIjG ITSVLTMTTL GTFSRKNFPR VSYITALDFY IAICFVFCFC 240 

ALLEFAVTiNF LIYHQTKAHA SPKLRHPRIN SRAHARTRAR SRACARQHQE AFVCQIVTTE 30 0 

CSDCEERPSC SAQQPPSPCS PECPRSLCSK LACCEWCKRF KKYFCMVPDC EGSTWQQGRL 360 
1 SRWFPVTFF FFNVLYWLVC LNL 



Seq ID NO: 11 DNA sequence

Nucleic Acid Accession #: NM_001076.1
Coding sequence: 22..1614

1 _1 21 31 41 51 

A GTAAGACCAG GATGTCTCTG AAATGGACGT CAGTCTTTCT GCTGATACAG 
C TGGAAGCTGT GGAAAGGTGC TAGTGTGGCC CACACAATAC 
ACCCATTCGA TAAATATGAA GACAATCCTG GAAGAGCTTG TTCAGAGGGG TCATGAGGTG 
ACTGTGTTGA CATCTTCGGC TTCTACTCTT GTCAATGCCA GTAAATCATC TGCTATTAAA 
TTAGAAGTTT ATCCTACATC TTTAACTAAA AATGATTTGG AAGATTCTCT TCTGAAAATT 
CTCGATAGAT GGATATATGG TGTTTCAAAA AATACATTTT GGTCATATTT TTCACAATTA 
CAAGAATTGT GTTGGGAATA TTATGACTAC AGTAACAAGC TCTGTAAAGA TGCAGTTTTG 
AATAAGAAAC TTATGATGAA ACTACAAGAG TCAAAGTTTG ATGTCATTCT GGCAGATGCC 
T GTGGTGAGCT ACTGGCTGAA CTATTTAACA TACCCTTTCT GTACAGTCTT 
G TTGGCTACAC ATTTGAGAAG AATGGTGGAG GATTTCTGTT CCCTCCTTCC 
G TTGTTATGTC AGAATTAAGT GATCAAATGA TTTTCATGGA GAGGA7AAAA 
?GACTTT TGGTTTCAAA TTTATGATCT GAAGAAGTGG 
T TCTAGGAAGA CCCACTACAT TATTTGAGAC AATGGGGAAA 
G AACCTATTGG G ~ 
3 ACTTCACTGT A 

GAAGAGTTTG TGCAGAGCTC TGGAGAAAAT GGTATTGTG3 TGTTTTCTCT GGGGTC3ATG 
ATCAGTAACA TGTCAGAAGA AAGTGCCAAC ATGATTGCAT CAGCCCTTGC CCAGATCCCA 
CAAAAGGTTC TATGGAGATT TGATGGCAAG AAGCCAAATA CATTAGGTTC C 
CTGTACAAGT GGTTACCCCA GAATGACCTT CTTGGTCATC CCAAAACCAA A 

actcatggtg gaaccaatgg catctatgag g 

t ttgcggatca acatgataac attgctcaca tgaaagccaa gggagcagcc 
c catgtcaagt a 

attaatgacc ctgtctataa agagaatgtc atgaaattat caagaattca t 
c ccctggatcg agcagtcttc tggattgagt t 

c agctcacaac ctcacctgga tccagtacca c 
:gtggca actgtgatat ttatcatcac a 
ctgttttgtt tccgaaagct tgccaaaaca g 
CAAAAGCCTG aagtggaatg ACTGAAAGAT GGGACTCCTC ctttatttca GIATCG' G 
TTTTAAATGG AGGATTTCCT TTTTCCTGTG ACAAAACATC TTTTCACAAC TTACCTTGTT 
AAGACAAAAT TTATTTTCCA GGGATTTAAT ACGTACTTTA GTTGGAATTA TTCTATGTCA 
ATGATTTTTA AGCTATGAAA AATACAATGG GGGGAAGGAT AGCATTTGGA GATATACCTA 
ATGTTAAATG ACGAGTTACT GGATGCAGCA CGCAACATGG CACATGTGTA Tj 



Protein Accession #: NP_001067.1






YDYSNKLCKD AVLNKKLMMK LQESKFDVIL ADALNPCGEL LAELFNIPFL YSLRFSVGYT 
FEKNGGGFLF PPSYVPWMS ELSDQMIFME RIKNMIHMLY FDFWFQIYDL KKKDQFYSEV 
LGRPTTLFET MGKAEMWLIR TYWDFEFPRP FLPNVDFVGG LHCKPAKPLP KEMEEFVQSS 
GENGIWFSL GSMISNMSEE SANMIASALA QIPQKVLWRF DGKXPNTLGS NTRLYKWLPQ 
NDLLGHPKTK AFITHGGTNG IYEAIYHGIP MVGIPLFADQ HDtJIAHKKAK GAALSVDIRT 
MSSRDLLNAL KSVINDPVYK EHVMKLSRIH HDQPMKPLDR AVFWIEFVMR HK3A ■•CFTLRVA 
AHNLTWIQYH SLDVIAFLLA CVATVIFIIT KFCLFCFRKL AKTGKKKKRD 



Coding sequence 



CTGTCATTCA 



AAGTTTACTG TATATACATT AC-ACATICCT GTICTTTTTC- 



CACTTAAAGC 

TACTTGCAAC TTCTGACAAA CCCCATTCCG CTTTGCCAGA 
A TGGAGAGATT TTTAATGTCC AGTTACCGGA 



TTTTGCAGGC TTTGGAGGTA CTCCCAGTAG 
CAGAAGAAGT GAAACGACTA GAAGAACAAG 
? AGGCTTGCTA 



CATCTGTAAT CAGTAAAATT 



CTCCTGTGGC TTGCAGCACT CCTGCTCAGT TGAAGAGGAA 
GGTACTTAGG CACCATAAAA AAGCGAAGGA AGATTTCACA 
ATGCCATAGA TCACAAAATT GAGAGTGATA CAGAGGAAAC 



ACACCGAGTA 
AATTCGCAAA AAGTCAAACT 
GGCAAAGGAT GATAGCCAGA 
TCAAGACACA AGIGTAGATC 



CCTCTGAAAG 

TTCAAGACTC TACCAACACT ACACCATCTA CACAATTCAG AGACAAGATT 
5 ATAATACATA TTTCTGATGA 
T CCCAGGTAGA 



AAACTACAAC 
GCATCGCAAG 
CT'J 3 IT G 
TTAAGTCRTT 
AAAAGATTAA 
AGATAAATGT 



GATTAAAAAA 

TGGAAAATTT GTATGCAGTA ATCAGCCAAT GTATTTATCG 

AAACATCACT TATTCAGAAA ATGGAGCAAG AGGTAGAAAA 

GATGTCATGG TATCGAGTAT TCTTTATATT CAGTTCCTAT 

CCGCCTAA1T GATGTAGTAT GAAACCCTGC ATCTTTAAGG 

TAAAAGTATT TAAACTTTCC TGATATTTAT GTACATATTA 



GCTTGTAATG 
AAAGAAAIG1 
CTCATCACTG 
GAICATGACC 



GACCATGATA 
TCCAGATGAT 
TTTGTCATGT 



CATGTGTAAG 



I 



I 



MDLSSVISKI DLHKYLTVKD YLRDIDLICS 

IKEELDEDFE QLCEEIQESR KKRGCSSSKY 

TPSTPVACST PAQLKRKIRK KSNWYLGTIK 

ACNGDASSSQ IIHISDSNEG KEMCVLRMTR 

- - SI IFQLENLYAV 



NALEYNPDRD PC-DRLIRHRA 



KERKISQAKD DSQNAIDHXI ESDTEETQDT 
LRNNSNTCMI EHELEDSRXT TACT3LRDKI 
ARRSQVEQQQ LITVEKALAI LSQPTPSLW 
ISQCIYRHRK DHDXTSLIQK MEQEVENFSC 



c;agacagta gctcagcctc 

" TC AATACCTTCA 



3 CATTTATGAA 
GAAGAAATTG GTGCTTCAGA GTCAAGAGTC 
TTTCATCTCC AGAGAAAAAG AGAACCTGTT 
C CAGGGGCAA GGTTTCTGAG GGACTTCAAG 
TCACTAGCAC TCTCATTTCT 



AAGATTAACC TGGTCTCCAG 
GCACCTTTCA ATTACACACA 
CCTGGGATTG ATTCCGGAGA ACAACTTGCT 
CAAATTTGGT T 
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20 



GCCTTCTGAA GATGGTAAAG AACTT ' 3G '.I3TGGT3G 840 



AGTAAAAAGG ACTGTAGGAG GCCAAGACAG GTACAGGAGG CACCACACTA 
r GAGACAATGA AAACAAACTT 
T TGGACAAAAA TC.AATTGACG 



T ACCTCATCTC 12 60 

T TTTGCAGTGG 1320 

AGACATATGG CCAACTCCAC CTTACCCAAG TGGCTGAAAG TCACTGCACC AGTAATGGCA 1380 

CAAACCAATG TGAGATGATT CCTGATATGA TACACTAAAA AGGGCACTGT CTCTTCTGCA 1440 

3 TAAGCTGACA CTGAAACTAA TAATTAGGCA ATGTCAAGCA 1S00 



ATAATTGTTT C-AGGCCAC-GA GTTCCAGATC 
AGCCTGGGCA ACATCATGCG ACCCCATCTC T, 

GTGGTAGCAT GCACCCGTAG TCTCAGCTAC TCAGGAGCCT GAGGCAGGAG G 
ACATAGGAGA TCGAGGCTGC TGTGAGCTAT GATCGTGCTA C 
CACAGCAAGT T 1 

GTGACAATAA AAATGGAGAA AAAGTAGGCT GACTCAGGAA A 
TACCTCAAAG ATATTGTAGA TTTGATTCGA GACCACCACA A 
AAGTGAGTCA CACAAATTGT TTTGTTTCCT TGTGAATATG A 
GGCTGGGTGT GATGGCTCAT GCCTATAATC CCAGTACTTT AGGAGACGGA GGCGGGAGGG 2100 
TCACTTGAGC CCAGGAATTG TGAGATCAAC CTGGGCATAT AGGGAGATCC TGTCTCTATT 2160 
TAAAAAAAGA AGCTATGTTT ACACTACACT ATAGTCTATT TAAAGTGTGA AATGGCGTTA 2220

TGTCCTTAAT TTTAAAACTC TTGATGCTGG CTGGGTTCGG TGGCTCATAC CTGTAATCCC 2280
ATCACTTTGG GAGGCCAAGA CAGGTTGATT ACTTGAATTC AGGAGTTCAA GACCAGCCTG 2340 
GACAACATGG CAAAACACGT CTTTAAAAAA AGAAAAGAAA AAAGAAAAAC AGAAAGAAAA 2400 
^ AGAAGAAAAA CTACTTGCTG CCCTTACTTG AAGCTCAATT ATTTAAAAC 

Seq ID NO: 16 DNA sequence 

Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 51 

| | | | | |

CTTTTTTTTT TTTTTTTTTT TAGTAGAGAC AGG " ( Gl ^TGGTCT 60 

CGATCTCCTG ACCTCATGAT CTTCCTGCTT TGGCCTCCCA AAGTGCTGCG ATTACAGGCG 120 

TGAGCCACTG CACCCAGCCC AGAGTTTTTT TTAACAAGGT TCTTCTCAGC AATTCTAGTA 180 

TCCAGATATA GGCCCATCAT AGACATCACA CAAGCGTGTA CTTCATAATC CTGGTGAATA 240

CAGAAGTTTC CTGGACTCCT TGATGAGCTA CTGCTTTCGC TCCTATATCA GTGTTTTCAG 300

CTGATGTCAT TTGTGATTGT GTTTCTGACT TTCTGTAGGC AGAAAAAAAC TTTCATTTTT 360

TTTTTGCTTA CATGCACATA AATGTAAGCG CTAATTCTTA TATTAAACTG TTTATTTCTA 420

TAATACTTAA TTGGCTGTTT TCCTGGCTGA ACCAAACCAA GAGCATAAGG AATGATAACC 480

TTCAAAACTG ATTAAATTAG AGATCAATAA ATGGAGCTGT TTTAATTCTA TTATTCTTCT 540

TTCATAGATT AAATAGAAAA TTTTT



ACTAATAATT GTTTTCCTAA TTAGCTCCTT AGTCCTCTAT ATCATCTCGC T. 
AACAAAACTA ACACATACAA GCACAATAGA TGCACAAGAA GTTGAAACCA TTTGAACTAT 
TCTACCAGCT GTAATCCTTA TCATAATTGC TCTCCCCTCT CTACGCATTC TATATATAAT 
C AACAACCCCG TATTAACCGT TAAAACCATA GGGCACCAAT G 



T AAAACTGATG CCATCCCAGG CCGACTAAAT CCAGCACAGT ACATCAACCG 
A TTCTATGGCC AATGTCTGAA TTTGTGGTCT TACCATAGCT TITTGCCATT 
GTCCTAGAAT GGGTCCCTAA AATATTTCGG NACTGGTCTG 



2 Acid Accessic 



GCCTTCGAAC TTCCTGCTCC TTTAACCGTA ACTCAGCCTT TTCAGATTCA ATCTGGAGGA 
TAGCCAGGGT TT1CTCGTAG TTCTTTTCAG GGCCATCATA GAAATTCCGG GCGATCCATC 
TTGATATCGG ATGCTTGTAA TACTCCCAGT GTTCAGGGAT GTAGCCTTCT GGGA7TTCTG 
CAAGCTCGGC TTCACCAATA AATATGTTCA CCAGTGTTAT GCCAATTATA ACTGGGATCC 
CAGTCAACAT AAGGTAGAAT TTCATTAACC TCAAGAAGGG AGCGTCATAG TATAAAGAAG 
GCTTGACGAC AAACAGTCTC TTGCCATGTC CCCACT3TGC CGCACAG3AG C3ACASTCTT 
CGGAAANTCC GCGTGAGAAA ACTTCCGACT CCGAGTCTAG GACCASC3CG GCG3CAAGAC 
CACGCTGTCA GCGCGGAGAC CGAANCCGCT GCAGCAGCTC ATGGCCGCCA TGG 
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TAGTCCAGTN A 



TACTTTA ATTTCGCTTT TCCATAATAC TGGTATTCCA T 



TTTAGTCAAG CATTCAGTAG AGG CAATAAT CAAACC7CTA TCCCAACATT TTACACTTGT 
T ACAACATACA TTTTTGGCAA TTTACTATTA A 
3 GCCCATATAT ATATATATAT ATTTTTGGAC A 




5 CCCGGGACGC G 

; ATGCGGCGGC AGCGGCG 
GGAGGCGGCG TCCGGCGCCA CTGCAGCCGC A 
GCGCGCCCGC CCGCCGTCGC TGCCCTTCCT GTTGGGATTA TCTTCTGCTC CCCGCTGCTT 
: CGCGGTCGM GCCGCCTCTA GGCTTCAGCG GCTCGGACTC CTTGGCAC-CC 



TCAACATCAG GAAGGTGCTG GACGGCCTGA CCGCAGGCTC GTCCTCC-GCC TCGCAACAGC 



TGGCCTTTGA TCCCGTTCAG AAGATCCTGG CGGTAGGAAC CCAGACTGGT GCTTTAAGGC 
TCTTTGGTCG TCCAGGGGTG GAATGTTATT GCCAGCACGA CAGCGGAGCG GCAGTGATTC 
AACTCCAGTT CCTGATTAAT GAGGGAGCCC TTGTGAGTGC 
ACTTGTGGAA TTTACGTCAG AAAAGGCCTG CTGTGCTACA 
AAAGGGTTAC ATTTTGCCAT CTGCCTTTCC AGAGTAAGTG GCTCTATGT3 GGCACGGAAC 
GAGGTAATAT ACATATTGTC AATGTGGAGT CCTTCACACT CTCAGGCTAC GTCATTATGT 
GGAATAAAGC CATCGAACTG 
ATAATCCCAT GGACGAGGGG 

GGGACCTTAA GTCAAAGAAG GCTGACTACA GATACACTTA CGACGAGGCT 
A TCATGAAGGA AAACAGTTTA TTTGCAGTCA TTCTGATGGT 



TAAACCATCC GAAGAAACCC GAGCCGTGCA AGCCTATCCT CAAGGTGGAG TTCAAGACAA 
r CGGGAGGCTT ATCATATGAT ACCGTGGGAA 
A AAAGCACGGC AGTGCTGGAA ATGGACTATT 
A CGCCATATCC AAAT2ATTTT CAGGAGCCGT 



GAAGACCTTG 
CAATTGTCGA 
ATGCTGTGGT 
ACCCTATATT 
AATATTTTGC 
AGAAACGTCA 
GTGCTCAAAG 



TGAGAATCCC 
TGATTGTCCT 
AGGTTACAGC 
TTACCCAGAA 



ATGCATGGGA 
CTCTGTGAAA 
GAGAAGGATT 
TACCCTTTGA 
CTGGACCTTA 



CCATTCAGAT CATCTCCTGG 



ATAATTATTA 
CTACAAGTAC 
GACAGACAGA 
TGCCCAGAGA 



CAGGGCATGC 
TGTATAAATT 
ACACCGACAT 
GCAGAATGCT 



GTCCCCTGTT 
TTATTCTGTT 
TGGTGGTAAT 
TGATGGCTCA 
AAAAACATCT 
TGTAGATGAA 
GTGCATAGCC 



TTGAAGTCCG ACTGTTATAT 
CCCCTTTGTC CACTCCCGTG 
CGTCTACCAG CAGCAGCTCA 
AAAACTCACC ACTTAAACAG 
GGGTGGGTGG AGAACCCCCG 
TGGTGGTTTT CGGCAACTCC 
TGCTCAACCT CAGCACCATT 
GGTCGCCCCG CAAATCTCGA 



TCTCCCGGCT A' 



GCCGGAGGGT 
CATCCCCCCT 
TGTACCGTGT 
GCTAGTCATC 
ACTCAACTCT 
CTACCTCCAG 
G GCTCAAATGA TCCTTATCGG 



ATTAAATTCT 
AAAGTATTTG 
GATCCATATG 
GGAGTGTCGG 
ATCCCGATGC 
GAGCAGCCAC 
CAGTCTCATC 
TTAAAAGTTA 



GCTTGCCAAC TGATCTAAAG CCTGATTTAG A 



C CGACTTCCGC A 



AAAGCAGTGC 
AGAGAACCGA 
ACCGAAGGAA 
AGGAAATTAA 
AGCAGATCTC 



CAGAGCACAA 
TCTCAGTGTC 
CTGAAAAGCA 



TTTCACAAGG 
AACTGCCTTT 
AGTGATTGTG 
ATTTCTGGAT 



AAGGCAGACT 
GTCATCACGC 
TCTCCAAGCG 



TCGCCTGCTT 



AGCAAAGGTC 
GTCCTTCGTG 
CTGTGCCAAC 
CTACTACCTG 
GCAAGCCTTA 
GTGTGAAAAC 



GAAAAAGACG 
TCTCAGGAAA 
ATCTCACTGC 
CTCCGTGGAG 
GGCCACATTA 
CCCCTTACCA 



CCTCCCCCTC 
TGAATCTCCC 
GTACTATATT 
GCTTAATGCC 



CCTGGGGCCT GAGCAGAGAC 
GAGGTTAAAA GGTGCGATCT 
ACCTGCATAC GAACCCTGGA 



CTTCAGGAGA 



TTAGTGAAAA CCAGTACGCA GTGATATGTT 
CAACCCAGAA CTGTGCATAC AAGCAGAACA 
ACATTGTCC-C CCTGAGTAAC AGTGTCTGCC 
TGACTTTCAG TTTGCCGAGC TTGAGC-CCTC 
ACATGCGGAT AGCCAGGACA TTCTGCTTCG 
CACCTACCGA AATCCAGAGA CTCACCTACA 



AGCACATCCC GGGTCCTGGC 



CCGAGCCAGG CTGGCCCTCG 



AAGGCTTATT TGGAGGTGGT GCACAATCTC 
CCTCGGGAAA GGCGTCAAGC- AGCCTTGCAC 
GTGTGAAGGG AGCCGCGTCC- GGAGTGGTGG 
ACGAAAGAGG ACAGAAGCTC AGCGACTTGG 
CAGACTCGTT 



2040 
2100 
21.60 

2340 
2400 

2520 
25B0 

2820 

3000 
3060 

31B0 



AGATAACATA AAAGGGATGC ACACTGCTGA C 



3CGTCTT7 CCCAGCACAA TCATGCACTT 
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I I I I I I 

MRKFNIRKVL DGLTAGSSSA SQQQQQQQHP PGNEEPEIQE TLQ3EHFQLC KTVRKGFPYQ 
PSALAFDPVQ KILAVGTQTG ALRLFGRPGV ECYCQHDSGA AVIQLQFLIN EGALV SALAD 
DTLHLWNLRQ KRPAVLHSLK FCRERVTFCH LPFQSKWLYV GTERGNIHIV NVESFTLSGY 

VIMWNKAIEL SSKSHPGPW HISDNPMDEG KLLIGFESGT WLWDLKSKK ADYRYTYDSA 

IHSVAWHHEG KQFICSHSDG TLTIWNVRSP TKPVQTITPH GKQLKDGKKP EPCKPILKVE 

FKTTRSGEPF I ILSGGLS YD TVGRRPCLTV MHGKSTAVLE MDYSIVDFL? LCETPYPIOF 

QEPYAVWLL EKDLVLIDLA QNGYPIFENP YPLSIHESPV TCCEYFADCP VDLIPALYSV 

GARQKRQGYS KKEWPINGGN WGLGAQSYPE IIITGHADGS IKFWDASAIT LQVLYKLKTS 

KVFEKSRNKD DRQNTDIVDE DPYAIQIISW CPESRMLCIA GVSAHVIIYR FSKQEWTEV 

IPMLEVRLLY EINDVETPEG EQPPPLSTPV GSSTSQPIPP QSHPSTSSSS SDGLRDNVPC 

LKVKNSPLKQ SPGYQTELVI QLWVGGEPP QQITSLALNS SYGLWFGMS NGIAMVDYLQ ' 

KAVLLNLSTI ELYGSNDPYR REPRSPRKSR QPSGAGLCDI 1 



P EQRLLQPVIV SPSGTILRLK GAILRMAFLD AAGCLMPPAY 

EPWTEHNVPE EKDEKEKLKK RRPVSVSPSS SQEISENQYA VIC3EKQAKV ISLPTQMCAY 

KQNITETSFV LRGDIVALSN SVCLACFCAN GHIMTFSLPS LRPLLDVYYL PLTNMRXART 

FCFANSGQAL YLVSPTEIQR LTYSQETCEN LQEMLGELFT PVETPEAP^R GFFKGLFGGG : 

AQSLDREELF GESSSGKASR SLAQHIPGPG GIEGVKGAAS GWGELARAR LALDERGQKL : 



Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 

I I I I I 

TCCCATCGGG TGAACCGTGG TCTTGTTCCG TCCGCCCACA ATCGCTCTCC A 

30 GCCCCGGCAA AGCCTGGCTC GTTCACAGCT CTCTCGCACC TCCTGGAGCT TCAGCTTCTT 
CCGTTGCAGA GAAGCTTTAT GGGCCAATTC GTTCGGCATC CCGGGGGCAG GTGCGCGGTG 
CGCGGGGAAG AAGAGGATTT GACTGCGGTT CTCCACCCCC GGCGCCCAAC CTCCACCCCG 
GTGCGCGCGC TCTTCCAGGC TCCTGCTGGT CCCACTTGCC AGGAGTTAGG TCTCAGGTCA 
GCCTGAGCTC CTGAGACGCC CAGGCCCGGA AAGACACGTA GGGGAAACCA TCTGCTCACT 

35 TCTGTCCTGT CCGGAAGGGA TCCCTTTCTG ACGGGAAAGA AAGGCGCTAA ACAAGCACTG 

GCCTTGAGAT AAGCAATGCT GAAGCACTTG CAGCTCACCT ATTACCATAA ACTGACTGAG 
CCCTCCCTAC ACAAGCCGTA ACTACTGCTT TGATTGGACA AGAGACTGAT TTCAGTAGTT 
TTCTCTTGAT AAGAGACCAC TGGCCGTGGG CGGGTTCTGG ACAGTTTACA GAAGC7ATGC 
ACTTGATTGC CTTTGTGTCC CTGCTTCACC TTTTGAAGCA TAGGGCCTAA TTATAATGTA

TTTAAATGTT GTCTCCACCC CAAAGTGAAC ATGGGTTGCA TGTAACAGGC ATGTTTACTC
AGCATGCATG CAGCAGGATC CCTTCACAAA TATTCAGAGC TCCCCCTATT CCCTGTTGAA 
TATGTATATG TGGCCACCCA GATCAACGTA AATCACTATT CGCCCTCCCC TCCCTGGAAA 
CCTACTTTTC GGGTTTCAGC AGGAAGCTAT GCCTCCCAGG CTTGTCGAAG AGGGCCCATT 
TTCGGGCTTG ATAACCCCTT TATAAAAAAA TAAAATCTCC TTTCTAAATT TAAAATACAA 

CCACACCACC GGCCCGCAAC TATTGGGGGG GAAAAAGAAT GAAGACACAC GGTACATAGT
TTCATGCACA TTGTTAAGGA GACAGGTGCC CCCAAGCAGG CGGACATCAC GCAGTACGCA
GCTTGAGCAT GCCGAAGACG CGAGCGACTC ATAGAACACG ACGACGCTCG CAAGGCACTA
AGCATAGCTA CTACCACTCG TCGAAGAGTC ATACACAGAT TTCTATTGGC GA 

Seq ID NO: 23 DNA sequence

Nucleic Acid Accession #: CAT cluster 

i i 1 r i 1 r r 

CTATGAATCT CGGAAATTAC TCAAACCATC AGCCTCTGCA AGAAC-CAAAG TGGACGGCCG

AGGTCAGGAG ATCGAGACTG TTCTGGCTAA ACCAGTGAAA CCCCCTCTCT ACTAAAAAAA 
TAAGAAAAGC GAAGTGCATC TCCCATAAAC GAGGTACTGC AGGAAGAAAG CAGAAAATGA 

AGGAAAGGAA ACATTTTCAA ATAAGCATTT GGAGATGGGA AAAACACCTT GAAACAGAAA
TTCATAAAGT ACAGAATTTT TTTTTAAGTT AAAAAAGGAA CAATAATAGA CAGAAAATGA 
ATGAAAAATT AAATGTCATA TCAGAAGTGA AGATAAATTA AAAGTGGTCA AAGGAGAAGA 
GATCTAAATG CAAACTTAAG AAGGGGCAAT TTTTTTTTTT TTTTTTTTTG AGACGCAGCC 
TCACTCTGTC GC 


Seq ID NO: 24 DNA sequence



AAGGGACGCA CCACGCCAGC CCCAGCCCGG CTCCAGCGAC AGCCAACGCC TCTTCCAGCG 



: CCCCGTCGGC CCAGCGCTGC CAGCCCGAGT TTGCAGAGAG GTAACTCCCT 
TTGGCTGCGA GCGGGCGAGC TAGCTGCACA TTGCAAAGAA GGCTCTTAGG A 
CTGGGGAGCG GCTTCAGCAC TGCAGCCACG ACCCGCCTGG TTAGAATTCC GGCGGAGAGA 
ACCCTCTGTT TTCCCCCACT CTCTCTCCAC CTCCTCCTGC CTTCCCCACC CCGACTGCGG 
AGCAGAGATC AAAAGATGAA AAGGCAGTCA GGTCTTCAGT AGCCAAAAAA CAAAACAAAC 
AAAAACAAAA AAGCCGAAAT AAAAGAAAAA GATAATAACT CAGTTCTTAT TTGCACCTAC 780

TTCAGTGGAC ACTGAATTTG GAAGGTGGAG GATTTTGTTT TTTTCTT7TA AGATCTGGGC 840 

ATCTTTTGAA TCTACCCTTC AAGTATTAAG AGACA3ACTL, rGAGCCT ( ^GGGCAGATC 900

TTGTCCACCG TGTGTCTTCT TCTGCACGAG ACTTTGAGGC 7GTCAGAGCG CTTTTTGCGT 960 

GGTTGCTCCC GCAAGTTTCC TTCTCTGGAG CTTCCCGCAG GTGGGCAGCT AGCTGCAGCG 1020 

ACTACCGCAT CATCACAGCC TGTTGAACTC TTCTGAGCAA GAGAAGGGGA GGCGGGGTAA 1080 

GGGAAGTAGG TGGAAGATTC AGCCAAGCTC AAGGATGGAA GTGCAGTTAG GGCTGGGAAG 1140 

GGTCTACCCT CGGCCGCCGT CCAAGACCTA CCGAGGAGCT TTCCAGAATC TGTTCCAGAG 1200

CGTGCGCGAA GTGATCCAGA ACCCGGGCCC CAGGCACCCA GAGGCCGCGA GCGCAGCACC 1260 

TCCCGGCGCC AGTTTGCTGC TGCTGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA 1320 

GCAGCAGCAG CAGCAGCAGC AGCAGCAAGA GACTAGCCCC AGGCAGCAGC AGCAGCAGCA 1380 

GGGTGAGGAT GGTTCTCCCC AAGCCCATCG TAGAGGCCCC ACAGGCTACC TGGTCCTGGA 1440 

TGAGGAACAG CAACCTTCAC AGCCGCAGTC GGCCCTGGAG TGCCACCCCG AGAGAGGTTG 1500

CGTCCCAGAG CCTGGAGCCG CCGTGGCCGC CAGCAAGGGG CTGCCGCAGC AGCTGCCAGC 1560 

ACCTCCGGAC GAGGATGACT CAGCTGCCCC ATCCACGTTG TCCCTGCTGG GCCCCACTTT 1620 

CCCCGGCTTA AGCAGCTGCT CCGCTGACCT TAAAGACATC CTGAGCGAGG CCAGCACCAT 1680 

GCAACTCCTT CAGCAACAGC AGCAGGAAGC AGTATCCGAA GGCAGCAGCA GCGGGAGAGC 1740 
GAGGGAGGCC TCGGGGGCTC CCACTTCCTC C 



TTTCAAGGGA GGTTACACCA AAGGGCTAGA AGGCGAGAGC CTAGGCTGCT CTGGCAGCGC 



AGCACTGGAC GAGGCAGCTG CGTACCAGAG TCGCGACTAC TACAACTTTC CACTGGCTCT 
GGCCGGACCG CCGCCCCCTC CGCCGCCTCC CCATCCCCAC GCTCGCATCA AGCTGGAGAA 
CCCGCTGGAC TACGGCAGCG CCTGGGCGGC TGCGGCGGCG CAGTGCCGCT ATGGGGACCT : 
GGCGAGCCTG CATGGCGCGG GTGCAGCGGG ACCCGGTTCT GGGTCACCCT CAGCCGCCGC 
TTCCTCATCC TGGCACACTC TCTTCACAGC CGAAGAAGGC CAGTTGTATG GACCGTGTGG 



GGCGGGCCAG GAAAGCGACT TCACCGCACC TGATGTGTGG TACCCTGGCG GCATGGTGAG 
CAGAGTGCCC TATCCCAGTC CCACTTGTGT CAAAAGC 
CTACTCCGGA CCTTACGGGG ACATGCGTTT GGAGACTGCC Ai 



TCACTATGGA GCTCTCACAT GTGGAAGCTG CAAGGTCTTC TTCAAAAGAG CCGCTGAAGG 2880

GAAACAGAAG TACCTGTGCG CCAGCAGAAA TGATTGCACT ATTGATAAAT TCCGAAGGAA 2940 

AAATTGTCCA TCTTGTCGTC TTCGGAAATG TTATGAAGCA GGGATGACTC TGGGAGCCCG 3000 

GAAGCTGAAG AAACTTGGTA ATCTGAAACT ACAGGAGGAA GGAGAGGCTT CCAGCACCAC 3060

CAGCCCCACT GAGGAGACAA CCCAGAAGCT GACAGTGTCA CACATTGAAG GCTATGAATG 3120 

TCAGCCCATC TTTCTGAATG TCCTGGAAGC CATTGAGCCA GGTGTAGTGT GTGCTGGACA 3180

CGACAACAAC CAGCCCGACT CCTTTGCAGC CTTGCTCTCT AGCCTCAATG AACTGGGAGA 3240 

C CAAGGCCTTG CCTGGCTTCC GCAACTTACA 3300 

:TGGATG GGGCTCATGG TGTTTGCCAT 3360

h TCCTTCACCA ATGTCAACTC CAGGATGCTC TACTTCGCCC CTGATCTGGT 3420 

TTTCAATGAG TACCGCATGC ACAAGTCCCG GATGTACAGC CAGTGTGTCC GAATGAGGCA 3480 

CCTCTCTCAA GAGTTTGGAT GGCTCCAAAT CACCCCCCAG GAATTCCTGT GCATGAAAGC 3540 

ACTGCTACTC TTCAGCATTA TTCCAGTGGA TGGGCTGAAA AATCAAAAAT TCTTTGATGA 3600 

ACTTCGAATG AACTACATCA AGGAACTCGA TCGTATCATT GCATGCAAAA GAAAAAATCC 3660

CACATCCTGC TCAAGACGCT TCTACCAGCT CACCAAGCTC CTGGACTCCG TGCAGCCTAT 3720

TGCGAGAGAG CTGCATCAGT TCACTTTTGA CCTGCTAATC AAGTCACACA TGGTGAGCGT 3780 

GGACTTTCCG GAAATGATGG CAGAGATCAT CTCTGTGCAA GTGCCCAAGA TCCTTTCTGG 3840 

GAAAGTCAAG CCCATCTATT TCCACACCCA GTGAAGCATT GGAAACCCTA TTTCCCCACC 3900 

CCAGCTCATG CCCCCTTTCA GATGTCTTCT GCCTGTTATA ACTCTGCACT ACTCCTCTGC 3960 

AGTGCCTTGG GGAATTTCCT CTATTGATGT ACAGTCTGTC ATGAACATGT TCCTGAATTC 4020 

TATTTGCTGG GCTTTTTTTT TCTCTTTCTC TCCTTTCTTT TTCTTCTTCC CTCCCTATCT 4080 

AACCCTCCCA TGGCACCTTC AGACTTTGCT TCCCATTGTG GCTCCTATC1 GTGTTITGAA 4140 

TGGTGTTGTA TGCCTTTAAA TCTGTGATGA TCCTCATATG GCCCAGTC-TC AAGT7GTGCT 4200 

TGTTTACAGC ACTACTCTGT GCCAGCCACA CAAACGTTTA CTTATCTTAT GCCACGGGAA 4260
3 CTAAGATTAT CTGGGGAAAT CAAAACAAAA AACAAGCAAA C 



I I I I I I 

MEVQLGLGRV YPRPPSKTYR GAFQNLFQSV REVIQNPGPR HPEAASAAPP GASLLLLQQQ 
QQQQQQQQQQ QQQQQQQQET SPRQQQQQQG EDGSPQAHRR GPTGYLVLDE EQQPSQPQSA
LECHPERGCV PEPGAAVAAS KGLPQQLPAP PDEDDSAAPS TLSLLGPTFP GLSSCSADLK 
DILSEASTMQ LLQQQQQEAV SEGSSSGRAR EASGAPTSSK DNYLGGTS7I SDNAKELCKA 
VSVSMGLGVE ALEHLSPGEQ LRGDCMYAPL LGVPPAVRPT PCAPLAECKG 3LLDDSAGKS 
TEDTAEYSPF KGGYTKGLEG ESLGCSGSAA AGSSGTLELP STLSLYKSGA LDEAAAYQSR 

DYYNFPLALA GPPPPPPPPH PHAR1KLENP LDYGSAWAAA AAQCRYGDLA S 

GSGSPSAAAS SSWHTLFTAE EGQLYGPCGG GGGGGGGGGG GG3GGGGGGG G 
GYTRPPQGLA GQESDFTAPD VWYPGG1WSR VPYPSPTCVK SEMGPWMDSY S 
TARDHVLPID YYFPPQKTCL ICGDEASGCH YGALTCGSCK VFFKRAAEGK QKYLCASRND 
CTIDKFRRKN CPSCRLRKCY EAGMTLGARK LKKLGNLKLQ EEGEASSTTS PTEETTQKLT 
VSHIEGYECQ PIFLNVLEAI EPGWCAGHD WNQPDSFAAL LSSLNEL3ER QLVHWKWAK 
ALPGFRNLHV DDQMAVIQYS WMGLMVFAMG WRSFTHVNSR MLYFAPDLVF NEYRMHKSRM 
YSQCVRMRHL SQEFGWLQIT PQEFLCMKAL LLFSIIPVDG LKNQKFFDEL RMNYIKELDR 
IIACKRKNPT SCSRRFYQLT KLLDSVQPIA RELHQFTFDL LIKSHMVSVD FPEMMAEIIS 
VQVPKILSGK VKPIYFHTQ 
GGAGGCTAAG AGTGTTTAAT TTTCCCCAAG TTCCAGTGCT Al
NNTGAACCTG TGTTAATGGT GTTTCTAGTC GATGCTG7TA TCTGTTGCAC CACA7T7TGA 
ATAATCTTGG ACTTTCAGAG TATGAAGGAC GATTAAATAT AACCCTTTGG TATAAATGTT 
CTCTCTCTCG CTCCTCTGTA ACAATTGGAG AAACAGAGIT CTAACAATAI TAAAATCAGC 
CATAGACAGA GAGTAGTGAG AAATATACTT TTTTTAATAC AGAAGGTTCC CTGAAGTACT 
TTTAGTATTA TTCTAAATTA AGCAATAACC AATGAACAAT TTTGGTCATA AGCAGTTTCT 
CTCCAGAAAA AAAAAAAAAA AGTCGAC 



Seq I 



I I I I

3 CCCTCTGCTG C 

AATGCCGAGT TCTGCCCAGC TCTTGTTTCT GAGCTGTTAG ACTTCTTCTT C 
CCTCTGTTCA AGTTAAGTCT TGCCAAATTT GATGCCCCTC O 
TTAGGAGTGA AGAGATGCAC GGATCAGATG TCCCTTCAGA AACGAAGCCT CATTGCGGAA 
GTCCTGGTGA AAATATTGAA GAAATGTAGT GTGTGACATG TAAAAACTTT CATCCTGGTT 
TCCACTGTCT TTCAATGACA CCCTGATCTT CACTGCAGAA TGTAAAGGTT TCAACGTCTT 
GCTTTAATAA ATCACTTGCT CTAC 



Seq ID NO: 29 DNA sequence
Nucleic Acid Accession #: NM
Coding sequence: 1..50S1 



ATGGCTCAGA 
ACAAGAGCAA 
AAACTGCAAA 
ACCAGAAAAA 
TCAGATTCCC 



AAGATGTGGA 
AGGATAGACA 
AAGCACAGGT 
AAAAAAGAGC 



CATTTTCACA TCC3GAACCA 



CCTATTCTGA 
CAGTGGCCAC 
ACTTACAGTA 
TCTACAGAAC 
ACACCTGCCA 
ACTGACATGG 
AAAGCAAGGA 
AAGTCTGAGG 
GTGGATAATG 
AAGGATCCTT 
AGAAAGGTGA 



GCCCTTCCTT 
CTGGATTACC 
AACAGGCTGC 
CTATATATTT 
CACCCTTTCA 



ATTAGATATT 
CAGTTTCGAG ACTAAAAAAA CACCTGTATT 
TTCAGCAl 



TGGGCCTTCC 
ATTCCAAAAT 
AAGTCTTCCG 
TCCACAAGGA 



ACTTATGCTT 
GGCTTCAATC 
GGACAATCTC 
AGCTTACCTA 



CTGATTTGGA 
ATATCAGTAA 
TGGAGGTATT 
GGGATGCTGT 
ATGGAAAATC 
CTCAGCTTGC 
TGCCAACTGG 
TTTGTCGATC 



ATTTGACTGG 
AGACCATGAG 
TCTTCTTGAA 
CCTTTCTGTG 
AAAAGCCCAG 
AAGTTCTCTT 
CATTACAAAA 



TCAAAAGTCA 
TTAGACTTGG 
GAAGAGAAAA 
GAGAGATCGA 
GCAACTGTTA 



CTGCAGAATA 



CTTCAAGAAG 
T1GAAGACCA 
GTCACAGCGC 
GAAGGATTTC 
ATAATGCAAG 
GTTCTAAAAG 
GAGCATATTC 



TACCTTCTAT 
CAAGAATGCC 
CATATTTC7C 
TCTATCGTCC 
CA3AATTTTT 
GCAATCTACA 
ATCCTCTAAG 

CAC-CAAATTG 
CAAGAAGCCA 
CTCAGAAAGA 
TT3AAGTACA 
AATTTCCATA 
AAAGAAACAT 
AGCTACCAGT 



ACCAGTTACT 
TCAGAGAGGA 
TTATCCTTCT 
CACTTTTCCA 
ATATCCTTTG 
AGTAGTCAGT 
AAAAAATGGG 
GGTATCTCCA 
TAAGCCTAAG 
TTTGCTAGCA 
TCATCTTGAA 
GTCTTTAAAT 



ATGCGGAGAA 
TACTTTTACG 
GGTACATGAT 
AGAGGAAGTG 



TGTATCAAAT A 



TGAAGGTCTC 
GTTCTACTGT 
AAGTAGATGT 
ATCATTGCCT 
GACTACAACT 
GCAGAAGATG ATGAAACACC 
TGCAAAGAAG CCATGACGAG 
GTAGAACTGG CTCTTCAAAT 
GTAAGAAAAA TCTGTAGTGC TTTAGATGGT GTCGAGACTC TTGCCATTAC AGAATCACTA 
AAGAAGCTAA AGAGAGCAGT TAATCTTCCA AGGAGTAAAA CTGCTGATGT GACTTCTTTG 
TTTGGAGGAG AAGACACTAG CAGGAGTTCA ACTAGGGGCT CACTTAATCC TGAAAATCCT 
GTTCAAGTAA GCATAAACCA ATTAACTGCA GCAATTTATG ATCTTCTCAG ACTCCATGCA 
GGAGTCCTAC AGACTGTGCC CAAAGTAGCA AGAGTGTCAA GGAAGCATGG 
AGCAGCTCCA GTTTACTATT TTTGCTGCIC ATGGAATTTC 
ATGAAAAATA CTACTTGATA TGTTCACTGT CTCACAATGG 
TTCAATCAAA GAAGGTTGGC ACTTACAAGA ATTTCTTCTA TCTTATTAAA 
TAATCATTTT TCCTATCCAG ATATCACAAT TGCCATTAGA ATCAGTTCTT 



ACTACAACAG 
GTATCAAATT 
TTTAAACCTA 
TGGGATGAAC 
CACCTTACTC
CAGAGAAAGG 
TTTTTAACAT 
CCTGGAACAG 
CCTTCTCCTG 
CAACATAACT 
AAAGACTCAT 



GACCAGAAGC 
GTGGAACTAA 
TTACCAAAAA 
CATTTGATAT 
TAGAAACACT 
CACTTGGACT 



AGGATATGTC 
TATTTATACA 
AGAGAATGAT 
TTCTAAAGAA 



AGCAGTGGAA 
GTTTCTTTAC 
CTTTGGACTT 
ATGGAAAGAA 



TGGGGTAA1C 
ATTGCATTGG 
ACCTGGATTG 
GCTTTGAAAT 
TTGGGAAATA 



TTGCCAAAAC 
AACTTCTTGA 
AGGCCATTAG 
ATGAAATTTA 
TCCAGATAGC 



ATAAAAGGGA 
GATAAAGCTT 
AAAATATTAG 
CTTCACCAC-T 
GCTGATCAGG 



GTTCCCCTGA TTC3AATAAG 

CTCTTTGTGA CTTTAGACGG 

CATCACATAC AAATTCTGTT 

TAGTGCTACA 3GTTGATTTT 

TTGACAGAAG CAT7ATACAC- 

AACTTCTTGA TATTCTTCAT 

TTTTA7GGGA GAAACGTTAT 



ACACAATTTA T 



GGCCTGCATT 3TACCCACTA 

AAG7AAGATC CCTAGCTGTG 

TTCTTCCACA 3TTTGTACAA 

AATTCCTTTT 3TCCAGGGCA 

TCAAAGATGC CCTGCATGAT 



CATTATATTT 



AAGTTACCAG 
GCCAAAAGGA 
GTAGCAGAGT 
GAAGGGATAG 
GGAGCTGTGA 
ATCAAAGATC 
CTTCCAGATA 
CCGACATTCA 
GAACTTCAAC 
GTAACCCTGC 
ACTGCGGCAA 



ATGTACTCCG 
TCGAATTTCA 
GCTTTCCTAA 
AAATTGAGTT 
GTGATCTTGT 
CTAGGTCTGC 
AATTATCCAT 
TTGTTACTGA 
ACCACAAAAC 
ATGAAATGCT 
TAAGTGTACT 
CTTTGAAAGA 
CATACTTGTA 



AATTTTGTGG 
GGAACTTCAC AATAAGCTCA 
TAGGATGGTT CTAGGAAGAA 
AAACAGTTAC TTACAGAGTT 



AGATGCAGGT TCCTTCAGTC 
CTCTTACCGA AATGGTACTC 
AGATGGAGCT GACCCAAATC 
ATCCAAACGT AAAACCAAAA 



TACTTCGTGA 
CTACTCCAGG CCAAATAGGA 
TTTTCATCAT GGTGATGCAT 
CATATGTCAA AACATACCTA 
TTTCACGAAA AACGAGGAAT 
AAGAAACCCT AAGACAGCGA 
AGAATTTTTT CTTGGGTGGA 
CGGTTAAATG GTATCAGCTG 



CGACTTAGAG AAGAACTTCT AAAACAGACG AAACTTG7AC AGCTTTTAGG AGGAGTAC-CA 
GAAAAAGTAA GGCAGGCTAG TGGATCAGCC AGACAGGTTG TTCTCCAAAS AAGTATGGAA 
CGAGTACAGT CCTTTTTTCA GAAAAATAAA TGCCGTCTCC CTCTCAAGCC AAGTCTAGTG 



GTCACAATGG TGAATGCTGA CCCTCTGGGA GAAGAAATTA ATSTCATGTT TAAGGTTGGT 

CTTAAAGAAG GACTAGATCT GAGGATGGTA ATTTTCAAAT GTCTCTCAAC TGGCAGAGAT 

GGTGTGACAG GATCCTTTAA AGATAAACCA CTTGCAGAGT GGCTAAGGAA ATACAATCCC 
GGCTTCAGAG AACTTTATCT ATTCCTGTGC TGGATGC1GT 



AACTTGATAA GAAAGCAGAC A 
GGGTTACCAG AACTTACAAG TATTCAAGAT T 
CAAACTACAG ACGCAGAAGC TACAATTTTC TTTACTAGGC T 
AGCATTGCCA CAAAGTTTAA CTTCTTCATT CACAACCTTG CTCAGCTTCG TTTTTCTGGT 






STEPIYLSLP 
KARTDLEITD 
KDPWDAVLLE 
TSSLPTGSSL 



I I I I I 

" KECPFSHPEP TRAKDVDKEE ALQMEAEALA KLQKDRQVTD NQRGFELSSS 
DVEKLTQAEL 
TYALPSIYPS 
SLPIYRPWS 



QDYDLMVFPE 
LYFRPTIQRG 
GQSPYFSYPL 
SKVSNLQVSP 
ERSTANCHLE 
LQEVEVQNEE 
EGFQLPVTFT 



QWPPGLPGPS T 
TPATPFHPQG S 
KSEDISKFDW L 
RKVNGKSLSV A 
MAAFCRSITK L 



NSGRSPTDCA 



JSYEiUQ 
QSSKSVKEAW 



F SAMCQNLART 



IRTTOLAKAO 
RTMPGYLLSP 
DLMQVDV3SY 



GFNPRMPTFP 180 

ASTSEFLKNG 240 

EEKNVSSLLA 300 

GHISQKDPNG 360 

VTAQRNICGE 420 

VLKVCGQEEV 480 

NKHLYQIEKP 540 



TPQVDRSIIQ 

TWIEAISDDE LTDLLPQFVQ 
VQFSTRYEHV 
RVQSFFQKWK 
EDLRQDMLAL QMIKIMDKIW 



TTTEQLQFTI 
WDELIIFPIQ 
FLTCGTKLLY 
QHNLETLEND 



TRGSLNPENP VQVSINQLTA 
FAAHGISSNW VSNYSKYYLI
ISQLPLESVL HLTLFGILNQ 
LWTSSHTNSV PGTVTKX3YV 
IKGKLLDILH KDSSLGLSKE 
LHQWPALYPL IALELLDSKF 
SLVQFLLSRA LGNIQIAHNL 
KLVQLLGGVA EKVP.QAS3SA 



MERIVLQVDF 



ADQEVRSLAV 960 



RQWLQRSME 
EEINVMFKVG 
DTLRKIQVEY 



TGHMFHIDFG KFLGHAQMFG SFKKDRAPFV LTSDMAYVIN GGEKPTIRFO LFVDLCCQAY 
NLIRKQTNLF LNLLSLMIPS GLPELTSIQD LKYVRDALQP QTTDAEATIF FTRLIES3LG
I HNLAQLRFSG LPSNDEPILS FSPKTYSFRQ DGRIKEVSVF TYHKKYIJPDK 
« EGQIEPSFVF RTFVEFQELH NKLSIIFPLW RLPGFPNRMV LGRTHIKDVA 
Y LQSLMNASTD VAECDLVCTF FHPLLRJEKA EGIARSADAC- 3FSPTPGQIG 
R NGTLFIMVMH IKDLVTEDGA DPNPYVKTYL LPDNHKTSKR KTKISRKTRN 
S GYSKETLRQR ELQLSVLSAE SLRENFFLGG VTLPLKDFNL 3K3TVKWYQL 
TTGGTATTTT CACTGTCAAT TATGCCTCGT ATTATTTAT? TATTTGCCAA AATACGACTC-
A TAGAGCTCAT GACACATAAT AGGTATTCAC TGAGCATTTG 
A TATTTCAGCT CAACAGGCAC ACTAGGGGCC 
AGATGAGCAC TGACTTTCCC CATTGAGGAG TCTCGATTAC CTCATGTCTC ACTTCAAACA 
1 TAGCTGGGTT CAAGAGTTCT TTCTTGTTTT GTCGGATATA 
I AATACAGTAA AGACATCAAA TAGACC 



GATTTCAGTT TCAGGTGATT 

T TTGCCATTAA TTTAAAAGTT CCAGTATCCT 



GGGTTGCATT 



1 11 21 31 41 51 

I I I I I I 

TTTTAAGATG GAGTTTTCGC TCTTGTTGCC CAGGCTGGAG TGCAGTGGTG CAATCTTGGC 60 

TCACTGCAAC CTCTGCCCCC CGGGTTCAGG CGATTCTCTC CTCTCAGTCT CTCAAGTAAC 120 

TGGGATTACA GGCACACACC ACCACGGCCA GCTAATTATT TTGTATTTTG AGTCGAGAGG 180 

AGAATTCACC ATATTGGCCT CAGGTAATCC GCCCGCCTCG GCCTCCCAAA GTGTTGGGAT 240 

TATGGGCGTG AGL f - K3CCCGGC CCGTTTTATT GTTTGAAAAA CAAGTACAGG 300 

TTGTTATTAT CCAAGAATTG TTGATAGAGT ATATACTGTA TTTGAAGTGT AGAACTGAGG 360

CAGAGGCTGA TTAATATAAC TAGTTTACAT TTGTTAGCCT TTCACATCTG TGAAGGAATA 420 

AAGTACAGAC AAAAGTGGAA AACAAACCAG AAAAAAAAAA ATTGTGAAGC ACAGAGCTGC 480 

TTAAAAGAGT GGTGTCACAT TAAAAGAAAA AAGTCACAGA AATAAGTCAG TATTTTGTTT 540 

AGAGACTAGA ACTCCAACTG CTAGCCAACT GCCTAGAATA TAGTAAATAT TTTCTAGTTT 600

CTTAAATGAC TAGTAATATT CCTACATTAT GTGATGGCAT TTCCCAAACT GTTTAATTAG 660 

ATGTTAGATT TGTAGCCAAA TATGTCTAGG AAATGCTTAA ACAATATAAA ACAGTTTTAA 720 

TGATTGGCTT TTTAGAACGT TATATATIAG TGTGCTTTAT GCATATCCAA GAGGTGAGTG 780

AGGTATTTGG GGTTTTTCAG ACTTACTTGA TTACAGATCT GGAGTATCTC AAAACAGTTG 840

TTTTGTGGAA AACACTTTGG CAAACTCTGA GTCTTAGTCA TTAAAAATAG TTTTTGGGTA 900

AACAACAGTG TAATAGAAAT GGAAATTACT GATTCACATT GAGCCATGAA GAATTTATTT 960

TCAGCGATTT TTATAGAAGT TGCTTTATGA CAAAGAAAGC TTTGGTTAAC TGGCATTTGG 1020 

CATTTCACAC CCCTAAATTT TCTACATGAG GATTTATTTC TCTGGTTCTC TCACTTTCTC 1080 

ACTCAGTTAT ACTGAATTCA TTTATGATGA GCGCTCTCAA CCATTCTTAT TCATCAAAGC 1140

TGAAGTTGGC AGAGCCCTCI CTGGTACCTG ATTAGAAGTC CGTC1TCCGT CTCATAGGGA 1200 

AGTGTTAGAG ATGGATAATG TTTCTGTGTA GCAGAAGTAG TCATTATGTC CCCTTAAATT 1260 

CGGTCACTTT GACTGCAGTA GAGCTTCTTA GTGAGCAGTC TGTGATGGAG TATACTTTCG 1320 



ATGAGAAATT TAGTGAAAGA TTTAAAATCA TTTTTCAGAC TTTTTCCACA TTAGTTG3GA 1500 

AGCAAACCCC TTTTTTAAGG CAATGTCAGT TATTAAGCTT TAGGGAACCA CAIGCCACTT 1560

TAGGTAACAC ATGATTGGAG AGATTGAAGA GTGAAGTCCC TGCTTTAAAG TGTACTCCTG 1620
TGGACACAGT AATGCATATA TTTAAAATGG TTCATGTTAA GAGTAGGTAT A 



C TAGGTTTCTT GTGTGGAACT CAGTGGGCAA AATCTTAACT 3. 
TTGATTATTG GTATCACATT TATTAGTCTG TATGTATCTG TGTCATCGAT CTCCTTAAGA 1860 
AGAGACTCGT AGATATTGAC TGGGAGACCC AAGCTGAATG CTAAAATCTG CTCCATGGAT 1920
ATAAGCTGAT GCAGTCATCA TTTCACATTA AAATGTACCA CAGCTATATA TGCCGCAAAA 1980
AAAAAAAAAA AAAA 



CTACTACTAA ATTCGCGGCC GCGTCGACTT tttjttttt? TTGTCTTATG T 
GCACTGTTCA GCTCTTTTAG GCACTGCAAA GTTGTCTTGA ATTAGGJ 
ATGTGGGCGT GGGTGTTGAC CTACATCTGA ACAATTTACA TATGATTCAC CACAATTAAA 
CAATTTGGTT TGAAATAGCT ATAATTAAGT TATTATCAGA GAAGTAT7TA CTAGTCTAGA 
AATTCTAAAT TTATCTTCAC ATACACCCTA ACTGAGAAAA GGGCCACATT TTCTGCACTC
TATTAAGTAA AGCAAATGCT GAACTAAATG CCTCCATGTT AACATTTATA TTGTTAAGTT 
ACTGACAGCA TATTCTATGA ATGATTACGT TAGTCGTTTC TTTAAAAATT ATAG3TTTGA 
AATAGCAAGA AAAATATGAA ATGATGGTAG ACAAAAAAGA GTTTCAGTTT CTAACTTCTA 
ACTATATATA TACACACACA CATGCACACA GAATTGCCTT CCCGGATGTA TAGAAATTAI 
GCCGAATCGC



GTCCAGGCNC GATGGAAATT ATGGGGGAAT A 



I I



TGGGGCGAGT GGCCGGCATG G 

TGAGAGGCCA GGGACAGGGA GACCGGTGCG ATGGCAGAGC G 
GGGCCGGCCC GGCTGGCCTG AGCCGCCGGA GGAGCGGGGC TGCCTCTGCG C 
GCAGCGGGAA GGGCGAAACT CCGGAGCGCC GCGTCCCTGC GCCGCTGCGG C 
AAGGGGCCGA GCCCGCGCGG A 

CTGCCCGGGG GCGGCGGGGG ACATCGGAGG GCAGCGGAGC GAGCAGCGCC G 
CCGGCGCGGG AGGCGGCCGC AGCAATGCCG GGCCCGCTAG GGCTGCTCTG CTTCCTCGCC 
CTGGGGCTGC TCGGCTCGGC CGGGCCCAGC GGCGCGGCGC CGCCTCTCTG CGCGGC3CCC 
TGCAGCTGCG ACGGCGACCG TCGGGTGGAC TGCTCCGGGA AGGGGCTGAC GGCCGTGCCC 
GAGGGGCTCA GCGCCTTCAC CCAAGCGCTG GATATCAGTA TGAACAACAT TACTCAGTTG 

CTTTCTTTTA TCCACCCAAA GGCCTTGTCT GGGTTGAAAG AACTCAAAGT TCTAACGCTC 
C AGTACCCAGT GAAGCCATTC GAGGGCTGAG TGCTTTGCAG 



GTTCAGTTAC GGCATCTGTG GCTGGATGAC 



CTAGGATTTC 
CTCTTAAGAA 
CACAATTTAT 
CCCAATCTTA 
AGCATACCTA 



TGAGTCAACA 
ACTTGGGGGA 
ATAGTAATTC 
CTATACATTT 



CAGCGTAATC 
ATTCTAGATC 
GGGCCAATAA 



GCAAAAGACT 
TTTTGGGGTT 
AGTGTGGCAC 



CAGGAACTGT 
ATAATTTGTG 
GAGACCTTCC 
AAATCTACCA 
TGAGTAGAAA 
CTAACCTAGA 
TAAATCAACT 



CTGTTTTGAT 
ATTTCCTCAG 
TATTTCTGTT 
GTATGATAAT 
TTCCCTAGTC 



ACCCTGGCTC TCAACAAGAT CTCAAGCATC 
CTGGTAGTTC TGCATCTTCA TAACAATAAA 
GGACTAGATA ACCTGGAGAC CTTAGACTTG 



ATCCCTGATG 
CCTCTGTCTT 
ATTCGTGGTG 



TCAAGAACAA 
AAGTTTTAAT 
AATAAAGGAA 
CCTGATACAT 
TGTAAGTTTC 
GAAACTTGTG 



GAGCATTTGA 
TTGTGGGGAA 
CAAGCATGGT 
TGACAGGTAC 
G3ACTTTGSA 
CTCTGGAAGA 
AAGGCCTGAT 
GTAGAGCTTT 
CTTCCTTTCC 
AGCTGAAAGA 
ATGCTTATCA 
ATAACAGCCT 
r GCAGCAAATG TCACAAGCAC 



AAGATGCTTA 
GGTTGCCATG 
GGCACCTTTC 
GAAATTCACA 
AAT3AA7TAA 
GGCAACTTCA 
TCGGTACCAT 



TGGTAATCCA 
CTCAGCATCT 
GCAGCAGTTC 
AAAGATAAGC 



GAATATTTAC 
TTATTTTTCA 
TCCRAATTGT 
ATCCTAACTT 



TGGGAAGCTG GATGATTCGT CTTACTGTGT 
ACCTGCTTGT 
TTATAGGCTT 
TTCTTGATGC 



TCTTGAAAAT 
TAAGCCCTGT 
CTTGGTTGCA 



ATCTTCACCA 



TAATGCTAGC 
GCAATCATCT 
CAGGCTGTTT 
TTCCTACAGG 
TAGCATTTTT 
ACCTCTCAGA 



TGTGTCCTGG 
AGTAGCTGGG 
AACTGTCGAA 
CAAACAGTTC 
TCCCCTTTTC 
TGAAACGCCA 
ATTAATGGCC 
AAACTCACAA 



GGCAGATTCG 
TTTCTIGCAG 
AGAAGCTTAT 



CATAGAGGGG 
TCATTAGGAT 
GTTATCTACA 
TCTAGCATGA 



CTGCAAAAGA 
CCCTTTCGGC 
AA7ATTCTGC 
TCACTGTAAC 



AAGTTACTGA 
CAAGGTGGTT 
GGCAACCTGA 
AAACACTTGA 



CTATCAGCCC 
TGAATCCAGT CCTGTATGTT 
AGCGACGTGT TACCAAGAAA 



CTGTTTGCGA CTGCTGCGAA 
TAAAATCACA CAGCTGTCCT 
GGTCCGACTG TGGCACACAG 
TTCTGACCAG 



T CTATACTGGC 2220



AGAAAGTGCC 
TATAATGAAA 
TTTCCTAGGT 



GTTAGTGCTA 

CGCTTGGCTA 
ACCA1TGATC 







GCATTGGCAG T3GCTTCTIG C 
TCGGCCCACT CIGATTATGC A 
GTGCAGGCCT GTGGACGAGC CTGCITCTAC 
TAAAGACTGA 
GAGTGAACCC 

GTCATTTTCA A 



TTAAAAATCT ATTTTAAAAT GTGATTTTCT ATAAC1GAAG AAAATATCTT 

T CATCCT1AAT CTCAGGACAA CTTAC1GCAG G3CCAAAAAA GGGACTGTCC 
C TGTGAGAGTA TACATAGGCA TTACTTTATT ATGTTTTCAC TTGCCATCCT 
TGACATAAGA GAACTATAAA TTTTGTTTAA GCAATTTATA AATCTAAAAC CTGAAGATGT 
TTTTAAAACA ATATTAACAG CTGTTAGGTT AAAAAAATAG CTGGACATTT GTTTTCAGTC 
ATTATACATT GCTTTGGTCC AATCAGTAAT TTTTTCTTAA GTGTTTTGTG ATTACACTAC 
TAGAAAAAAA GTAAAAGGCT AATTGCTGTG TGGGTTTAGT CGATTTGGCT 
CTAATGTGGG GGTTTAATAG TATCTGAGGG ATTTGGTGGC TTCATGTAA1 
ATGAATACTT CCTAATATCG TTGGCTCTAC TAATATTTTC CAATTTGCTG GGATGTCACC 4200 
TAGCAATAGC TTGGATTATA TAGAAAGTAA ACTGTGGTCA ATACTTGCAT TTAATTAGAC 4260 
GAAACGGGGA GTAATTATGA CACGAAGTAC TTATGTTTAT TICTTAGTGA GCTGGATTAT 4320
CTTGAACCTG TGCTATTAAA TGGAAATTTC CATACATCTT CCCCATACTA TTTTTTATAA 4380



AAGAGCCTAT TCAATAGCTC 



CTCTGGTTAA ACAAGATAAT 



CTTGATTCCC ATCTATGGGC TTTAGACCTA TTACTGGGTG SAGTCTTAAA GITATAATTG 4560 
TTCAATATGT TTTTTGAACA GTGTGCTAAA T 
TTCTGAATAT ACTAAAAAAA TCCAGCTAGA T 
GTGCATATAA T 
GGCTTCTGTT C 



GAAGGATTTA TTTACAGTGT GTTGTAATTT TGTAAGGCCA A 



AAAAACTAGA ATAACAGATA TATAAAAGTG TTAATCTTT3 TGCTATATGG TATGAAATAC 5160 



I I I I I I 

MPGPLGLLCF LALGLLGSAG PSGAAPPLCA APCSCDGDRR VDCSGKGLTA VPEGLSAFTQ 
I QLPEDAFKNF PFLEELQLAG NDLSFIHPKA LSGLKELKVL TLQNNQLKTV 
A LQSLRLDANH ITSVPEDSFE GLVQLRHLWL DDMSLTEVPV HPLStCLPTLQ 
)FAFTNL S3LWLHLHN NKIRGLSQHC FDGLD^LETL DLSYN1ILGEF 
PQAIKARPSL KELGFHSNSI SVIPDGAFDG NPLLRTIHLY DNPLSFVGNS ASHMLSDLHS 
LVIRGASMVQ QFPNLTGTVH LESLTLTGTK ISSIPNNLCQ EQKMLRTLDL SYNNIRDLPS 
FNGCHALEEI SLQRNQIYQI KEGTFQGLIS LRILDLSRNL IHEIHSRAFA TLGPITNLDV 
SFNELTSFPT EGPNGLNQLK LVGNFKLKEA LAAKDFVNLR SLSVPYAYQC CAFWGCDSYA 
NLNTEDNSLQ DHSVAQEKGT ADAANVTSTL ENEEHSQIII HCTPSTGAFK PCEYLLGSWM 
IRLTVWFIFL VALFFNLLVI LTTFASCTSL PSSKLFIGLI SVSNLFMGIY TGILTFLDAV 
SWGRFAEFGI WWETGSGCKV AGFLAVFSSE SAIFLLMLAT VERSLSAKDI MKNGKSMHLK 
QFRVAALSAF LGATVAGCFP LFHRGEYSAS PLCLPFPTGE TPSLGFTVTL VLLNSLAFLL 
MAVIYTKLYC NLEKEDLSEN SQSSMIKHVA WLIFTNCIFF CPVAFFSFAP LITAISISPE 
IMKSVTLIFF PLPACLNPVL YVFFNPKFKE DWKLLKRRVT KKSGSVSVSI SSQGGCLEQD 
FYYDCGMYSH LQGNLTVCDC CESFLLTKPV SCKHLIKSHS CPALAVASCQ RPEGYWSDCG 
TQSAHSDYAD EEDSFVSDSS DQVQACGRAC FYQSRGFPLV R 



I I I I I I 

ATGCT&I LA 1 A G&oCAACTAC 

CCCAGTCCCA TCCCGAAATT CCACTTCGAG TTCTCCTCTG CTGTGCCCGA AGTCGTCCTG 
AACCTCTTCA ACTGCAAAAA TTGTGCAAAT GAAGCTGTGG TTCAAAAGAT TTTGGACAGG
GTGCTGTCAA GATACGATGT CCGCCTGAGA CCGAATTTTG GAGGTGCCCC TGTGCCTGTG 
AGAATATCTA TTTATGTCAC GAGCATTGAA CAGATCTCAG AAATGAATAT GGACTACACG 
ATCACGATGT TTTTTCATCA GACTTGGAAA GATTCACGCT TAGCATACTA TGAGACCACC 
CTGAACTTGA CCCTGGACTA TCGGATGCAT GAGAAGTTGT GGGTCCCTGA CTGCTACTTT 
TTGAACAGCA AGGATGCTTT CGTGCATGAT GTGACTGTGG AGAATCGCGT GTTTCAGCTT 
C CGACTCACCA CTACAGCAGC TTGT 



TACACGGTTG AAGACATCAT ATTATTCTGG GATGACAATG GGAACGCCAT CCACATGACT

CACTTTC CTGGGAAGGA C 

A CATACGCCTG ATACTGAAGT T 
T CTACTGGCCT ACTGTCC 

TCGTTTTGGA TGAACTATGA TTCCTCTGCA GCCAGGGTGA CAATTGGCTT A 

CTCATCCTGA CCACCATCGA CTCACATCTG CGGGATAAGC TCCCCAACAT TTCCTGTATC 



CGCTACCAGC AAGTGGTG3T AGGAAACGTG 



GCCCCCCTGG CAAGCCCGGA AAGCCTCGGT TCTTTGACGT CCACCTCCGA GCAGGCCCAG 
CTGGCCACCT CGGAAAGCCT CAGCCCACTC ACTTCTCTCT C 
ACTGGAGAAA GCCTGAGCGA Ti 
GTTCGCTTTA ATGGTTTCCA GGCTGATGAC AGTATTTTTC CTACCGAAAT CCGCAACCGT 
GTCGAAGCCC ATGGCCATGG TGTTACCCAT GACCATGAAG ATTCCAAT3A GAGCTTGAGC 



CCTGGGTGCT CCTTCACTGA AGGGTTCTCC TTCGATCTCT TTAATCCT3A CTACGTCCCA 1800 
AAGGTCGACA AGTGGTCCCG GTTCCTCTTC CCTCTGGCCT T 
TACTGGGTAT ACCATATGTA TTAG 



[> IRTWLAEGNY PSPIPKFHFE FSSAVPEWL NLFNCXNCAtT EAWQKILDR 
VLSRYDVRLR PNFGGAPVPV RISIYVTSIE QISEMNMDYT ITMFFHQTWK DSRLAYYETT 
LNLTLDYRMH EKLWVPDCYF LNSKDAFVHD VTVENRVFQL HPDGTVRYGI RLTTTAACSL 
DLHKPPMDKQ ACNLWESYG YTVEDIILPW DDNGNAIHMT EELHIPQPTP
YFYTGSYIRL ILKFQVQREV NSYLVQVYWP T 
LILTTIDSHL RDKLPNISCI KAIDIY] 
RRPRRVIARY RYQQWVGNV QDGLINVEDG VSSLPITPAQ APLASPESLG 3LTSTSEQAQ 
L TSLSGQAPLA TGESLSDLPS TSEQARHSYG VRFNC-FQADD 3I?PTE"RNR 



I I I I



3 GACAGGGTGC TGTCAAGATA 
CGATGTCCGC CTGAGACCGA ATTTTGGANN NATGCTTGCT ACTAACAGTA CCCGGGGCCT 
TAATGAAGAT GAGCTCATGG CCCATGGCCA AGAGAAGGAC AGTAGCTCAG AGTCTGAGGA 
TAGTTGCCCC CCAAGCCCTG GGTGCTCCTT CACTGAAGGG TTCTCCTTCC- ATCTCCTTAA 
TCCTGACTAC GTCCCAAAGG TCGACAAGTG GTCCCGGTTC CTCTICCCTC TGGCCTTTGG 
GTTGTTCAAC A' 



Seq ID NO: 



sequence 



GGCGTCCGCG CACACCTCCC CGCGCCGCCG C 



CGGTGCTGCT GCTGCTGCTG 



ATGACTGCCA 
AGCCTGGCTA 
TCAATGGAGG 
TTGATGGCTT 
AGAACAATGG 



CCGCCACCGC CCGCACTCCG CCGCC1CTGC 
CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 
CACTCCTCC? GCTGGCGGGC GCCCTCCCCC 
AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 
TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCIACAAG TGCTCCTGCA 



CCAAGGGGAA G 



T GTGAGGACAT CGATGAATGI GGAAATGAGC 3 60 



CATGTTGGCT 



CATGACGGTC 
CATACCTGTG 
AGTGACAATC 
GATCACGGCT 
AGGCCTGGTT 
AACGGTGGGT 



A1AATTGTCT 
TCAACGTCAT 
AGCACACCTG 
GTAGTCACAT 
TTGAGCTGGC 



TGATGTGGAC; 



CATTCACCGC TCGGAAGAGG 
CTGCAAGGAG GCCCCAAGGG 
CAAGAACCAG AGAGACTGCA 
CTGTGACGAT ACAGCCGATG 
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA



T ACGTTCCAAA GTGTCCAGGT 3060 

4 CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120 

GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180 
CCGTATCAGT GACTCATTAG A 



3 CGGATGTCTT G 
TTGGTCAGCC TAGGTGAGAC TCACCTG 

TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG A 

CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540

CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGA3 ACCTGGGAGG 3600 

ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTIGATCC CAGGAACTTG 3660

AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720



Protein Acces 



I I I I I I 

MGVAGRNRPG AAKAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 

HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMHK

DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD T " 
PQYKMHTDGR SCLEREDTVL EVTESNTTSV TOGDKRVKRR LLMETCAVNW Q 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTKECSIK MGGCQQVCVN 

TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV MLTCSSGKQV 
PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 
FHLOLSGMML DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 

PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPBFG
KNNCVSCPGN TTTDFDG3TN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 
PPPKRRILIV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYSRPI AFTSRSKKLW 
IQFKSNEGNS ARGFQVPYVT YDEDYQELIE DIVRDGRLYA SENHQEILKD K 
LAHPQNYFKY TAQESREMFP RSFIRLLRSK V 


Seq ID NO: 43 DNA sequence



TTTCTTCATT TTATGCTTTT CTCCCCTTTA TATATACTGG GCGGTTTTTC CTTGAGAAAT 
TTTCCATCTC ATTAATTCTC CTGCAGCAAT TCATAACTCT TTGGGGGCAT TCCTTTGTTT 
TTTGATATGA CTACTACCTG ACTGTATATA GTTTCCCTTT TTT-'TTTTTC CTCCCAGATT 
CTCTCCTTTC TACTGGCATC CTTTTCCATT TTACTCAATT TTCCTCAGTT AGGTTGACTT 
GCTTTTATAC CTGTGTGATG CTCCTTGCCA GATATCTAGC AAATGCCCCC AGGATCCAAT 
CATTTTTTTC CTAAGAAAAC TGAAAAGAAG CATGGCAAAT AACAGAGCTT GGAAAATAGG 
AAACTTTAAA ATACAAAGCC CAGTGAAATC TACTTGGAAG CCAATGCTTA GAGGCAAGAG 



TCACTTTCCA ACATTGGAAA GTTATGCATA TTCCAATTGA GCTAGCCCTT TTAAACAGCC 

TTAAAATTGT ATAAAAGAGA AGAAATTTAA GATATTGAAA ACTGGTAGAT AATAAAACCT 

AAATAAAGCT GGTTTTGGAA GAGCAGTGGC CACTGTGATT GACAATGGGG GCACTTACTG 

TTAAGGGGAT TTATAACAGA AGTACTTGAA CAGAATTGTG AAGAGAATAG AATTGTGCAT 

TCTTTTATCT GCCCAGAACC ACAGCTCCCA TGGGAAATAC TCCACCTCAI TCTACAACCT 

TCTGGCTGCA ACAAAAGCAG TCAAATTAAA ACATAACCCA AAGGGGGTAC CTAACCCAAC 

TTGAGAAAAT CATAGCATNC TCCCTTTGGC TATAACTNTT TCCACATGAA ATACATTCAA 



\ TTTTAGTATG CCTTGCAATT TTTTCCCTTT ATTCTGATGC 



GACAGGGCAG GTGATGCTCT CTTAGTCTCT TTAGGCTACT A 
GAGTAATTCA TAAACAACAG AGATTATTGT TCACAGATCT GGAGGCTGGA A 
CTAAAGGGCC AGAATATTTG GTGTTTGGTG AAGGTCAAAC ATTCAGACAC TCTCAACGAC 
TATAGCGACA GCAGCAGTCT TCAGGAATCC TATGTGAGGG ACAAACACTC AGAAGCCAGC 
IATCCTA TGTGAGGGAC AAACATTCAG AGCCCAGCAG TAGTGTTGTG 
G TGAGGGACAA ACTTTCAAAC CCTTGTAGCA GTGTTCTGGA ATCCTATGTG 
AGGGACAAAA ATTCAGAACC TTGTAGCAGT GTTCTGGAAT CCTATGTGAG GAACAATCA 
GCCAAGCACG A



3 CGCGCCTTCC 






GCCCCCCGGG ACCCCACGCG CGCCCCGGCC CGGCCCGGGG A 
CAAGCGCCTT GGTGTTCGCT CCGGCCGACG AGCCGCGGAC GGTCCTGGAG AGGAAGCCCC 
TGCCCCTGGG CGTGCGCGCC CCTCTGGCCG GTCCCAGCGC CGCCGCCCGC AGCCCGGAGC 
AGCTGTGCGC CCCGGCTGAG GCGGCGCCCT GCCCCGCGGA GCCCG 

CGGCGCTGGA ACCGAGCTCC AGCGCGGACG CAGGGACGGG ACCGGGGAGC GGCTCTTCGT 



GCGGTGCAGT GGCGTGCAGG G 

G GAACGTCGGA GGCACACCAT CGCCAGCGGC GTGGACTGCG G 
G GAGCTGGAGC A 

T GCAACGAGTG CAGGAGCGCC AGCGCCGCCT 



CACCCAGGAG 



CGTCCTCCTC CGGGCCCCCC 
GTCTGGCAGC AGCAGACCAT CCTCATGCTG 
GTGACCGAGA 



AGAAGTCGGC 



TTCCACCTTC 

CCCTTCGAGG GTGGGCGCCC CATCGCACCC ACCCTCTCTG 
CCAGGCACAG 
CTGCCCCCGG 



CAGCTCCCAG G' 



AGGGAAAGGC ACTGCCCACG CCAGGCTGCA 
GCGGCTCCGA CGCGGGTCCA AGGGCAGCTT 
GCTGTAGCTC GGACGGACGG AAGTAGATGG 
TGCCTGCCTG GCTGGGGAGC CCCAGGGATA 
GAGGGACCCT GGCTGCAGCG 
CCTCCCACAG ACCCTGGGGT GATGGCCTTC 
GAGTCCCACA CAACATCCTG TGAGCCTGGC 

C TCACCCCCCT TTGCTCTCAC GCCCAGCCTG 
T CCCTCAGCCA AGGAAAACGA GAACCCCCAG 
T TGGGTCTCAC 



CCGCCCTGGG 
CTTCCAACAA CGGGCAGCAG 
CCCGCTCAAC CAGGGCACCA 
AGGGGGTGGG GACGGCCTGT 
TTCAGGTTCT 



C CCTGCCCAGG 
T GGTGGGGGGG CCCTGCTCAG CCCAACCTGG 
!\ CCCCAACAGC ATCGATGGGT TCTGCAGCCC 
3 CCTCCGATGC GGGGTCAGTG 



TCCCCAGGAG GGCCCCCAGA 
ACTCAGGATT TCCAAGGCCT 
"CCCCAGGTT TCAGCTGGGA 
GGTACAGGAG GAGGCTGGGG 



CGATGCGGGG TCAGTGCGTG GGGGGCGCAG G 



ACTGTCCCAC 



TCCTACCCTG 



GGACTGCAGC 
GGGG^CTGGG 
GGATCCTGGC 



AAGGCACCTG TCTCAGAGGA GGGGCCCTGG 
AGCTCCATGC TAACCTGCCC ACAGCAACCC 
CTGCAGGGTG TCCCAGGACA CGCCCAAGTC 
AAGATGGGAG TGGGCTTTCC AGGGGACATA 
AAAGGGTGCA GGTCCTGAGG GCCTGTGCCC
GCAGTGGGTG GGCCAGTGGC AGCCAGGGAG 
CACCAGGGCC TCCCCACGTC TGCCTTTGAG 
ATCTTTACTG GACTGGAAGC AGGAGACAGA 




AAATAGGTCC 



ACCCTGCCTG 
AGCGTCCCTG 
GCACACTGTG 
AGTGTCCCCA 
GTGTTGATCA 



CCTCCGGAAA AACTGCCTTT CAGCCTTGGT 
TCCCAGTTTA CAGCTTGAAA TCAGGCTAGT 
TTAAAGGCCC CGGCTGGCAG GGTCTAGGTG 
GAGCCTGCCC TAGGACGCTG GGCGGGTCAG 
GGCTCTATCC GCGAGGTGCC AGTAGCGTGT 
ATGACACCCG GAAATGTCTC AGGATGTTGA 
GTTGAGAATC TGCCCCAGAG 
AGTTCCAAGG 






AGAAAGAAAG A 



3AGT GAGAGACCCC 



TTTCTGGAAA CATGAAAAAA A 



ATCTAAAGAA 



I I I I



MKVESRGPPS CWLRARASNS CLMSADFSCS SCVMRSLFSV TSWVRSRFCS FSKRMVCCCQ 
3 QGGPEEDGGR ARLAQAAASS SPRHRAT3CT LGS3RPSGRG LPAAPK3ALA 
3 CTRCSCCWYQ SRPRAIISKP CSSTSFSCSS SFICFSRPQS TPLAMVCLRR 
3APGPCT PLHRDKHEAL SLQTRRGALQ DPESTKSRSP VPSLRPRWSS 
R APRGRAPPQP GRTAAPGCGR RRWDRPEGEA RPGAGASSFG PSAARRPERT 
j IPGPGRGARG VPGGPPSALQ EGGAQVHAAA PPWRMSHRV RRL 



Coding sequence:



CAAAGCTCTA A 



A GCGAATTAAC 

CTGATTTATA GTCCTGTACT TTCTCTACGT GCCATATCCA TTATTAAAGA AATGAGTCTA 
T TTCTTATTCT CTTTCTTCAG 
TCTTTTTCAG TTAACCTACA CACACACACA CACACACACA CACACACACA CACATATGTT
C GGGTACGGTG ATAATTAAAA GAGGTAAGGT TTCTCTTGAG 
T TCTAAAATTG TGATGGCGGA TGCACACCTC TGAATATATT AAAAGCCATT 
GAAATGAAAA AAGGGTGGGG GGAATCCAAA AGTGTAGCAG ACCCAACCTT GAGATTTGCT 
TGTTTGGGAA TGAATTTTCC AATAACTTGA AAGTTGTAAA AACTCACACT TCTCAGGGTT 
AGGTGTCAGA AAGAAAAGGA AGTAATTTAT TCTTTAATAA AGCAATTGTT AAATACTCTT 
?TGCAGT GTCTACTCAT AGTGTCTATA TAGGTACCAT 
A ACTGTTCTCA TGTTACTTCA GAAAAATTTT GCTTCTAAGT 
GTGTATTCTA TGTCTGGTTA AATGTTCATT GAATTTTATT TAATCATTAA TCTCAACAGC 
ATTAAACAGT CAATAACATA AATGACAGTC TTCTCTTTGT ACTCCTCCC? GTACAACATC 

ACAGAGCTCC ATCTGTATAC ACGAAAGTCA CATGAAAATA GAACTCAGTG TTTTGTATTA
T TCAGTACATT TAGAAGTATT TTGCCTCCAA TATTCAACCA CAGTAAAAGA
:TGCAGG TTAAGATGAC GGAAAATACA ACTGCCTACG
CAGCTCCAGG ATCCAGCAAA CCGTTTCCCA AAGCCTGGAA GCAAAAC
GAGCGAACGT GAGTGTGAAA CCTCTTTAAG ACACCGTTGG GCTGCTTGGT T

A CTAGGATCCT GGGGATACAT GAAGCTTCTG T< 
A AAGCAATGGA GATTGGATGG ATGCACAATC GGAGACAAAG GCAAGTCCTT 
GTTTTCTTTG TTTTGCTGAG CTTGTCTGGG GCGGGCGCCG AGTTGGGGTC CTATTCCGTA 
GTGGAAGAAA CGGAGAGAGG CTCTTTTGTG GCAAATCTAG GAAAAGACC7 GGGGT7GGGG 
TTGACAGAGA TGTCCACCCG CAAGGCCAGG ATCATTTCCC AGGGGA? 
CAGCTCAAGG CTCAAACTGG GGATTTGCTC ATAAATGAGA AGCTAGATCG A 
TGCGGTCCCA CTGAGCCTTG CATACTACAT TTCCAAGTGT TAATGGAAAA CCCTTTAGAA 



AAGGAAATGA TTCTAAAAAT ACCGGAAAAC AGTCCTCTAG GAACTGAGT? CCCTCTGAAT 

CATGCTTTGG ACTTGGACGT AGGAAGCAAT AATGTTCAAA ACTATAAAAT CAGCCCAAGC

TCTCATTTCC GGGTTCTAAT CCATGAATTC AGAGATGGCA GGAAATACCC TGAGCTAGTG 
TTGGATAAAG AGCTGGATCG GGAGGAGGAG CCTCAACTAA G 



AATGATAACG CTCCTGAGTT TGAGCAGCCC ATCTACAAAG TGCAGATTCC AGAGAACAGT 
CCTCTTGGCT CCCTGGTTGC CACCGTCTCC GCCAGGGATT TAGACGGCGG A
AAAATATCAT ACACACTCTT TCAGCCTTCG G.

CCTATGACAG GGGAAGTTCG ACTGAGAAAG CAAGTAGATT TCGAAATGGT TACGTCTTAT 2100 

GAAGTGOGCA TCAAAGCCAC AGATGGGGGA GGTCTTTCAG GAAAGTGCAC TCTTCTCCTG 2160 

CAGGTGGTGG ACGTGAATGA CAATCCCCCA CAGGTGACCA TGTCTGCACT CACCAGCCCC 2220 

ATCCCAGAGA ACTCGCCTGA GATAGTAGTT GCTGTTTTCA GCGTTTCAGA TCCTGACTCC 2280

GGAAACAATG GGAAGACGAT TTCCTCCATC CAGGAAGACC TTCCCTTTCT TCTAAAACCT 2340

TCAGTCAAGA ACTTTTACAC CTTGGTAACG GAGAGAGCAC TCGACAGAGA AGCAAGAGCT 2400

GAATATAATA TCACCCTCAC CGTCACAGAT ATCCGGACTC CAAGGCTGAA AACGCACCAC 2460

AACATAACAG TGCAGATATC AGATGTCAAT GATAACGCCC CCACTTTCAC CCAAACCTCC 2520

TACACCCTGT TCGTCCGCGA GAACAACAGC CCCGCCCTGC ACATCGGCAG CGTCAGCGCC 2580

ACAGACACAG ACTCACGCAC CAACCCCCAG CTCACCTACT CGCTGCTGCC GCCCCAGGAC 2640 

CCGCACCTGC CCCTCGCCTC CCTGGTCTCC ATCAACGCAG ACAACGGGCA CCTGTTCGCC 2700 

CTCAGGTCGC TGGACTACGA GGCCCTGCGG GAGTTCGAGT TCCGCGTGAG CGCCACAGAC 2 7 60 

CGCGCCTCCC CCGCTTTGAG CAGCCAGGCG CTGGTGCGCG TGCTGGTGCT GGACGC CZAAC 2 82 0 

45 GACAACTCGC CCTTCGTGCT GTACCCGCTG CAGAACGGCT GCGCGCCCTG CACTGAGCTG 2B80 

GTGCCCCGGG CGGCCGAGCC GGGCTACCTG GTGACCAAGG TGGTGGCGGT GGACGGCGAC 2 940 
TCGGGCCAGA ATGCCTGGCT GTCGTACCAG CTGCTCAAGG CCACG_ I 

GGTGTGTGGG CGCACAATGG CGAGGTGCGC ACCGCCAGGC TGCTGAGCGA GCGCGACGCA 3 060 

GCCAAGCAGA GGCTGGTGGT GCTGGTCAAG GACAATGGCG AGCCTCCGCG CTCGGCCACC 312 0 

50 GCCACGCTGC ACGTGCTCCT GGTGGACGGC TTCTCCCAGC CCTTCCTGCC GCTCCCAGAG 3180 

GCGGCCCCCG GCCAGACCCA GGCCAACTCG CTCACTGTCT ACCTGGTGGT GGCGTTGGCC 3240 

TCGGTGTCGT CGCTCTTCCT CTTTTCGGTG CTCCTGTTCG TGGCGGTGCG GCTGTGCAGG 3 3 00 

AGGAGCAGGG CGGCCTCGGT GGGCCGCTGC TCGATGCCTG AGGGCCCCTT TCCAGGGCGT 3360 

CTGGTGGACG TAAGCGGCAC CGGGACCCTG TCCCAGAGCT ACCAATACGA GGTGTGTCTG 342 0 

55 ACAGGAGGCT CAGAAACAAG TGAGTTCAAG TTCCTGAAGC CGATTATCCC CAACTTCTCT 3480 

CCTTAGGGCA CTAGGAAAGA AATAGATTAA AATTCCACCC TTCACAATAG CTTTGGATTT 3540 

AATTATTGAT AGGAACCCAT TTGATAAATT CCTTAACTTC TTATGATTGT CTTGTTGATT 3 60 0 

AAATTGTTCA TGCTCACCAC CACCAATAAG GTATTTTTCT CTGATTGTTA GTTCAAATTA 3 66 0 

rn TATTGTTAAT TCCAGTTTCC CTTTTCCTCA TATTTACCCC GAAGAGGTGT TGCATATAGA 372 0 

A ACAAAATATA CTTTATCTTC AAAGTTGATG TCATTTAAAA TTTTTCCGTC 37 80 

A CTTCTCAGTT 3840 
TCCTAGAACT TCAAGTATTA AAATAACCTG TTGCATGTAT TAGGCATATT T 
CATTTCTTTT GTCTATTTTC C 
TAATACTTTT C 

G GGTCTTACTC TTGTCACCCA GGCTGGAGTG CAGTGGCACA ATCTTGGCTC 

T CTGCCTCCTG GGCTCAACGG ATCCTTCCAC CTCAGCCTCC CAAGTAGCTT 4140 

GGACTATAGG TGCATGCCAC CATGCCTGGC TAATCTTTTG CAGCGATGAG ATTTTGCCAA 4200 

GTTGCCCAGG CTGATCTTGA ACTCCTGGGC TCAAGCCATC CTCCCTCCTC AGCCTCCCAA 4260 

T AAGCCAATGT GCCCATCCAA AGTTTTATTT ATTTATTTTT 4320 

h GTCTCGTAAA GTTACCTTTA AAAAAAAAGT TCTATTTTCC CTGTATTGGT 4380
ATCTCCTTAA ATAAAATAAA ATATTCCTAT TGTAAGTGAT A 
CTTATCTAAA A 

TGTTTGTAGA CAAAAGGCAA AGGTATTATG TAAAAATATT TAATAATTTA TTCTTTCTAT 
A AAAAATCAGA GGTCCCTGTT ATATTTTTAA TGGCTAACAA CTCAATCTCA 
A AAAAAAACTT ATCAAAGAGA CATTTACATG GTTTGGCTTT T. 
ATAGTATACA TTGGCGGTAT CTAGCCCTTT CTCTGTAAAA TATCCCTATG T 
ATTTCTTGCT TATTATATGT AAAGTTGAGC TTCTTTCTAG ATATTAGGCC T 
ATTCTATGTG AGTCAGAAAA AAAAAAA 

Protein Accession #: NP_066008.1 
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MEIGWMHNRR 
TRKAEIISQG 
LRVIDINDHS 



GTNAQVTYSL 



PMFTEKEMIL KIPENSPLGT EFPLNEALDL DVGSNNVQNY K 
YPELVLDKEL DREEEPQLRL TLTALDGGSP PRSGTAQVRI E 

GGANGKISYT LFQPSEDISK TLEVNPMTGE 
CTLLLQWDV NDNPPQVTMS ALTSPIPENS 
SDPDSGNNGK TISSIQEDLP FLLKPSV 
LKTEHNITVQ ISDVNDNAPT FTQTSYTLFV R 
LPPQDPHLPL ASLVSINADN GHLFALRSLD YEALREFEFR V 
VLDAMDNSPF VLYPLQNGSA PCTELVP 
EPGLFGVWAH NGEVRTARLL SERDAAKQRL WLVKDNGEP PRSATATLHV 
LPLPEAAPGQ TQANSLTVYL WALASVSSL FLFSVLLFVA VRLCRRSRAA 
GTGTLSQSYQ YEVCLTGGSE TSEFKFLKPI IPNFSP 



TTTTTTTTTG ATAATACACA 



I 



I 



I 



TTCCTAGGGT TGCTTTTGCT 
TATACTAAAT AAAGAAAAAT 
AACCAAATAA AAATTGTGCA 
TACAATGTTT 
TAAGAAAGTA 
TTAAGATACA 



seg I 



TCTCTAAATA 

CATCTAAATG TATGTTTATA TATTTTATTT GTGCATTTTA
TTAGTTTGTA AAACGTTCTT ATTTTTATGA TAATGTAGTA 
CAGGAAATAG AAAATGAAGA AGAAAACATT AGCTATTGTC 

GGGTGAGGTA GAGACTGCAA AACATTGAAC CTGGGACAAA 
GAAAATGTTG AACTT. 
TCTAGAAAAT TTA 



Nucleic Acid Accession #: AF034799.1



I 



I 



C AGGAGTGCTG 



AGTGATGCCC A 1 



GGATGTCATC 
TATCGAATCC 
TGCACTGACA 
CTCTGAACTT 



TATGACCGAG 
CTAACAGGAG 
AAAGAATTAA 
AAAGCTGAAA 



GGTGTGAGCC 
GCAAGGACCC 
AGGACACCCC 
TTGAGCAGCT 
AGACCCAGGA 
ACTCACTCCA 



CCTCACCGGC TGCCAGCAGG 
TCCCTTCTCC TCAAGCCGGA GACTGCGGTT 
GAAATCACAG ACATTAGCAA TGATGTGTGA 
AATGAGCCAA AGGGGGTCCC AAAGCAGTGG 
GATGGTGAAT ATGCTAGATG AAAGGGATCG 
AAGCCTCTCA CTTGCCCAGC AAAGACTTCA 
GAGACAGCTC AATTCAGCCC TGCCACAGGA 
GCTGATCCAC CGGAATTTGC 



CAAGGCCTTG 
TCATATACAA 
CGATGAAACT 



CTGCTAATCA 
GGACAGAAAG TCCATGAGAA 
TAGCAGCCCT 



AAGACTATTA CTGGAGCATT TGGAGTGCCT 
GACGGTGGTA AAACGGCAAG CCCAGTCTCC 
AAATCTTTGT TTGAGCACCA 
TCTTTAGAAA GAGTCTCTGC 
GGAGATTGTT GCCTTGCGTG AACAAAATGT 
GGGATCCACA GAGTCAGAAC ATCTTGAAGG 
GCGTTTGTCC AATGGTTCTA T 



CTCAGTGCTC AGAGAGAATC 



GAATCAAGAA 
CAGAAAGAAT 



CTGAACTGGC 

CTAGGCAAAG 
GACTTCTGAC 
TAGAAGAAAA 



AATGAGGAGC ATAACAAGAG 



GTACTAAGCA 

ACGCTAGCCA 
CAGGAAGAAA 



GCCACCCTTT 



TCAGTGGGAT CCCTAGTGGA 
AATAAGAAGA CCAAGGAGAG GCCGCATGGG 
TCTTGGGGAT CACGAGTGGA ATAGAACTCA 
TGAAAGTGAC ACTGAAATGT CTGATATTGA 



AGAGTTGCGT GCTGAAGAAA TTGAAAATAG 2220



TCCAAAGCTC 



CCCCAAGAAG 

GTTAGGCAAA 
TCTTGAAGAA 



CATCGCTGGC CAGTTCATCT CCCCCCAGTG GACACTCAAC
GCCCTGCCAG GGAAATGGAT CGGATGGGAG TCATGACACT 
ATCGGAGAAA GATTGCAGTT GTG3AA - 3 ~ < 
ACAATTAAAT GTGAAACTTC TCCTCCTCCT ACCCCTAGAG CCCTCAGAAT 



TTGGTAGTGC CAACAGCAGC CAAGACTCTC TTCACAAAGC 
AGTCTTCAAT AGGACGTTTG TTTGGTAAAA A 
GCTTTATGGA GACTGAAGCT GCAGCTCAGG A 
AAGCTGAGAA GGATCGAAGA CTAAAGAAAA AGCATGAACT 
AGGGATTACC TTTTGCCCAG TGGGATGGGC CAACTGTGGT 
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CGCATGGCTA GAGCTTTGGT TGGGAATGCC TGCGTGGTAC GTGGCAGCCT GCCG 
CGTGAAGAGT GGTGCCATCA TGTCTGCTTT ATCTGACACT GAGATCC 

AATCAGCAAT CCACTGCATC GCTTAAAACT TCGATTAGCA ATCCAGGAGA TGGTTTCCCT 
AACAAGTCCT TCAGCTCCTC CAACATCTCG AACTCCTTCA GGCAACGTTT GGGTGACTCA 
TGAAGAAATG GAAAATCTTG CAGCTCCAGC AAAAACGAAA GAATCTGAGG AAGGAAGCTG 
GGCCCAGTGT CCGGTTTTTC TACAGACCCT GGCTTATGGA GATATGAATC A 
TGGAAATGAA TGGCTTCCCA GCTTGGGGTT ACCTCAGTAC AGAAGTTACT T 
CTTGGTAGAT GCAAGAATGT TAGATCACCT AACAAAAAAA GATCTCCGTG TCCATTTAAA 3360
AATGGTGGAT AGTTTCCATC GAACAAGTTT ACAATATGGA ATTATGTGCT TAAAGAGGTT 3420



TGAAAGTGAT GACAAGAACT TCAGACGTGG ATCAACCTGG AGAAGGCAGT TTCCTCCTCG 
TGAAGTACAT GGAATCAGCA TGATGCCTGG GTCCTCAGAA ACATTACCAG CTGGATTTAG 
GTTAACCACA ACCTCTGGGC AGTCAAGAAA AATGACAACA GATGTTGCTT CATCAAGACT
GCAGAGGTTA GACAACTCCA CTGTTCGCAC ATACTCATGT TGACCAGCCA CTCAAAGGAG 
GCAGCACTGA CCTGCTATGG CGTCTTTTCA GTCTACTCTA CCTAAAGTGC ACTACCATCT 
A GCAGTGAAAA CCTTTGTGAA AACTGAATTC 



QRLQDVIYDR DSLQRQLNSA LPQDIESLTG GLAGSKGADP PEFMLTKEL NACREQLLEK 
EEEISELKAE RNNTRLLLEH LECLVSRHER SLRMTVVKRQ AQSPSGVSSE VEVLKALKSL
FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 
K VHEKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAOMKBR LAALSSRVGE
K DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRY 



N KEAILRQMEE KNRQLQERLE LAEEKLQQTM RKAETLPSVE A 
TKAEETIIGNI EERMRIILEGO LEEKNQELQR ARQREKMNEE H 



QLHLKERMAA LEEKWVLIQE SETFRKKLEE SLHDKESLAE EIEKLHSELD QLKMRTGSLI 



NKEIRLIQEE KESTELRAEE IEHRVASVSL EGLNLAMVHP GTSITASVTA SSLASSSPPS 720

GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAVVEE DGREDKATIK CETSPPPTPR 780

ALRMTHTLPS SYHMDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLSEARR KGLPFAQWDG 900

PTVVAWLELK LCMPAWYVAA CRANVKSCAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960

MVSLISPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020

HEWIGNEWLP SLGLPQYRSY FMECLVDARM LDHLTKKDLR VKLKMVDSFH RTSLQYGIMC 1080

LKRLNYDRKE LERRREASQH EIKDVLWSN DRVIRWIQAI GLREYANNIL ESGVHGSLIA 1140

LDENFDYSSL ALLLQIPTQN TQARQILERE YNNLLALGTE RRLDESDDKW FRRGSTWRRQ 1200

FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSC
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It is understood that the examples described above in no way serve to limit the true 
scope of this invention, but rather are presented for illustrative purposes. All publications, 
sequences of accession numbers, and patent applications cited in this specification are herein 
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference. 
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WHAT IS CLAIMED IS: 

1. A method of detecting an androgen-independent prostate cancer cell in a sample from
a patient having undergone androgen ablation therapy, the method comprising determining 
the presence or absence of a nucleic acid comprising a sequence at least 80% identical to a 
sequence as shown in Tables 1A-4. 



2. The method of claim 1, wherein said determining is by hybridizing with a
polynucleotide that selectively hybridizes to a sequence at least 95% identical to a
sequence as shown in Tables 1A-4.



3. The method of claim 1, wherein the biological sample:

a) is a tissue sample; or 

b) comprises isolated nucleic acids. 

4. The method of claim 3:

a) wherein the nucleic acids are mRNA; or 

b) further comprising the step of amplifying nucleic acids before the step of 

contacting the biological sample with the polynucleotide. 

5. The method of claim 2, wherein the polynucleotide:

a) comprises a sequence as shown in Tables 1A-4;

b) is labeled, including a fluorescent label; or 

c) is immobilized on a solid surface. 

6. The method according to claim 1, wherein said biological sample is contacted with a
plurality of polynucleotides that each selectively hybridizes to a sequence at least 95%
identical to a first sequence as shown in Tables 1A-4.

7. The method according to claim 6, wherein said plurality of polynucleotides are
immobilized on a solid surface.
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8. An isolated polypeptide which is encoded by a nucleic acid molecule having 
a polynucleotide sequence as shown in Tables 1A-4.

9. An antibody that specifically binds a polypeptide of claim 8.

10. The antibody of claim 9:

a) further conjugated to an effector component, including a fluorescent label, a
radioisotope or a cytotoxic chemical; or

b) which is an antibody fragment or humanized antibody.

11. A method of detecting an androgen-independent prostate cancer cell in a patient
having undergone androgen ablation therapy, the method comprising contacting a sample
from said patient with an antibody of claim 9.

12. The method of claim 11, wherein:

a) the antibody is further conjugated to an effector component, e.g., a fluorescent
label; or

b) said sample comprises a cell.

13. A method of detecting antibodies specific to androgen-independent prostate cancer in
a patient having undergone androgen ablation, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid comprising a sequence
from Tables 1A-4.

14. A method of inhibiting proliferation of androgen-independent prostate cancer cells in
a patient having undergone androgen ablation therapy, the method comprising administering
to the patient a therapeutically effective amount of a compound that specifically eliminates
cells expressing an antigen listed in Tables 1A-4.

15. The method of claim 14, wherein the compound is an antibody.

16. A drug screening assay comprising the steps of: 
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a) administering a test compound to a mammal having a prostate proliferative 

condition or a cell isolated therefrom; 

b) comparing the level of gene expression of a polynucleotide that selectively 

hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-4 in a treated cell or mammal with the level of gene expression of
the polynucleotide in a control cell or mammal, wherein a test compound that 
modulates the level of expression of the polynucleotide is a candidate for the 
treatment of prostate cancer. 

17. The assay of claim 16, wherein:

a) the control is a mammal with prostate cancer or a cell therefrom that has not been 

treated with the test compound; or 

b) the control is a normal cell or mammal. 

18. A method for treating a mammal having a prostate proliferative condition or prostate 
cancer comprising administering a compound identified by the assay of claim 16. 

19. A pharmaceutical composition for treating a mammal having a prostate proliferative 
condition or prostate cancer, the composition comprising a compound identified by the assay
of claim 16 and a physiologically acceptable excipient. 

20. A method of detecting a prostate cancer associated transcript, the method comprising 
contacting a biological sample from the patient with a plurality of polynucleotides wherein at 
least two of said polynucleotides selectively hybridize to a different sequence at least 80%
identical to a sequence as shown in Tables 1A-4. 

21. A method of detecting a prostate cancer, the method comprising the steps of: 

a) providing a biological sample from a patient; 

b) contacting the biological sample with a first polynucleotide that selectively 

hybridizes to a sequence at least 80% identical to a first sequence as shown in 
Tables 1A-4, to determine the level of a prostate cancer-associated transcript 
in the biological sample; and with a second polynucleotide that selectively 
hybridizes to a second sequence at least 80% identical to a sequence not 
shown in Tables 1A-4; wherein the expression of said second sequence is not 
substantially changed in prostate cancer, to determine the level of expression 
of a control transcript in the biological sample; and 
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c) comparing the level of the prostate cancer-associated transcript to a level of the 
normal tissue associated transcript in the biological sample. 



22. A method for quantitation of a prostate cancer-associated transcript in a cell from a 
patient, the method comprising contacting a biological sample from the patient with a 
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence 

as shown in Tables 1A-4.

23. The method of claim 22, wherein:

a) the polynucleotide selectively hybridizes to a sequence at least 95% identical to a
sequence as shown in Tables 1A-4;
b) the biological sample is a tissue sample;
c) the biological sample comprises isolated nucleic acids;
d) the nucleic acids are mRNA;
e) further comprising the step of amplifying nucleic acids before the step of
contacting the biological sample with the polynucleotide;
f) the polynucleotide comprises a sequence as shown in Tables 1A-4;
g) the polynucleotide is labeled, including a fluorescent label; or
h) the polynucleotide is immobilized on a solid surface.

24. A biochip comprising a plurality of polynucleotides that selectively hybridize to a
sequence at least 80% identical to a sequence as shown in Tables 1A-4.

25. A method of screening drug candidates comprising:

a) providing a cell that expresses an expression profile gene selected from the group
consisting of an expression profile gene set forth in Tables 1A-4 or fragment
thereof;

b) adding a drug candidate to said cell; and

c) determining the effect of said drug candidate on the expression of said expression
profile gene.

26. A method according to claim 22 wherein said determining comprises comparing the
level of expression in the absence of said drug candidate to the level of expression in the 
presence of said drug candidate. 
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